Many times there are multiple data sets that we’d like to download from a website. Sometimes we are fortunate enough that the data is stored in multiple csv files within a zipped folder or under a common link. But what if the data is not centrally stored in a single zipped file or is not in csv format? For this post we will extract data that is stored in multiple pdf files from the Florida Division of Elections website.
Let’s utilize the html_nodes and html_attr functions from the rvest package to extract all links from the webpage then zoom in on only the pdf documents.
library(tidyverse)
library(tabulizer)
library(tabulizerjars)
library(knitr)
library(kableExtra)
library(rvest)
library(reshape)
library(scales)
url <- "http://dos.myflorida.com/elections/data-statistics/elections-data/general-election-summaries/"
#extract link names from page
links <- read_html(url) %>%
html_nodes("a") %>%
html_attr("href")
#detect pdf files we need
pdf_links <- links[str_detect(links, ".pdf")]
pdf_links
## [1] "/media/697842/2016-ge-summaries-ballots-by-type-activity.pdf"
## [2] "/media/697841/2016-ge-summaries-provisional-voting.pdf"
## [3] "/media/697839/2016-ge-summaries-poll-workers.pdf"
## [4] "/media/697840/2016-ge-summaries-precincts-and-polling-sites.pdf"
## [5] "/media/694976/2014ballotscast.pdf"
## [6] "/media/694977/2014provisionalballots.pdf"
## [7] "/media/694978/2014pollworkers.pdf"
## [8] "/media/694975/2014pollingsites.pdf"
## [9] "/media/693340/2012ballotscast.pdf"
## [10] "/media/693345/2012provisionalballots.pdf"
## [11] "/media/693338/2012pollworkers.pdf"
## [12] "/media/693339/2012pollingsites.pdf"
## [13] "/media/693353/2010ballotscast.pdf"
## [14] "/media/693336/2010provisionalballots.pdf"
## [15] "/media/693337/2010pollworkers.pdf"
## [16] "/media/693355/2010pollingsites.pdf"
## [17] "/media/693351/2008ballotscast.pdf"
## [18] "/media/693354/2008provisionalballots.pdf"
## [19] "/media/693352/2008pollworkers.pdf"
## [20] "/media/693349/2008pollingsites.pdf"
## [21] "/media/693346/2006ballotscast.pdf"
## [22] "/media/693348/2006provisionalballots.pdf"
## [23] "/media/693350/2006pollworkers.pdf"
## [24] "/media/693347/2006pollingsites.pdf"
## [25] "/media/693341/2004ballotscast.pdf"
## [26] "/media/693836/2004provisionalballots.pdf"
## [27] "/media/693343/2004pollworkerinfo.pdf"
## [28] "/media/693342/2004pollingsites.pdf"
Based on the output we see that there are 28 unique links with the .pdf extension. Our main focus is the voter turnout data for the 2004 to 2016 general elections. This data is contained in the ballotscast.pdf for 2004 to 2014 and in summaries-ballots-by-type-activity.pdf for 2016. Now that we know which links contain the relevant data we can extract them using str_detect. Let’s use ballotscast and type as keywords for identifying the necessary pdf files.
links_data <- pdf_links[str_detect(pdf_links,"ballotscast|type")]
links_data
## [1] "/media/697842/2016-ge-summaries-ballots-by-type-activity.pdf"
## [2] "/media/694976/2014ballotscast.pdf"
## [3] "/media/693340/2012ballotscast.pdf"
## [4] "/media/693353/2010ballotscast.pdf"
## [5] "/media/693351/2008ballotscast.pdf"
## [6] "/media/693346/2006ballotscast.pdf"
## [7] "/media/693341/2004ballotscast.pdf"
Great! We are now left with seven links. Upon looking at the pdf files, directly from the website, we notice that the files prior to 2012 are formatted differently than the files for 2014 and 2016. Additionally, the 2012 file is formatted differently than all the other files. This makes it difficult to import all the files in one pull so we will have to import and tidy the data in three sections. The first section will focus on the 2014 and 2016 files, the second on the 2012 file and the third on the 2004 to 2010 files. We’ll extract the data from each file, convert each file to a data frame, tidy it and finally combine all the data frames into one. Now who’s ready to have some fun?!
The first step will be to read in relevant files corresponding to this time frame. This can be achieved in three steps:
read_url function to extract the data from the pdf files, turn it into a list of data frames then append the year to each data frameread_url to save each file as a data frame#exclude 2012 link as this file is formatted differently than others
all_links <- links_data[-3]
#only 2012 link
link_2012 <- links_data[3]
#create read_url function #
read_url <- function(url){
df <- extract_tables(url)
df <- do.call(rbind, df) %>%
data.frame() #create a dataframe for each file within the list
df <- cbind(df,'year' = substr(url,39,42)) #get year for each df
}
#apply function to each link
for(i in seq_along(all_links)){
df <- read_url(paste0("http://dos.myflorida.com",all_links[i]))
assign(paste0("df", i),df)
}
Executing read_url within the for loop created six data frames df1, df2,…, df6 corresponding to the years 2004 to 2016 but excluding 2012. Thus, df1 represents 2016 data, df2 represents 2014 data, df3 represents 2010 data and so on.
Inspect Data
Lets look at the rows of df1 to determine its structure.
kable(df1,row.names = FALSE) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "400px")
| . | year |
|---|---|
| FLORIDA DEPARTMENT OF STATE | 2016 |
| DIVISION OF ELECTIONS | 2016 |
| Voting Activity by Ballot Type for 2016 General Election | 2016 |
| Source: Responses by the County Supervisors of Elections to the U.S. Election Assistance Commission’s 2016 Election | 2016 |
| Administration and Voting Survey. (Columns F1b, F1f, F1c+F1d , E1a of the survey) . Data reflects countyâ€specific number of | 2016 |
| persons who voted and attempted to vote by one of the four categories below. For official election results of ballots cast | 2016 |
| and counted for eligible/registered voters, refer to: http://doe.dos.state.fl.us/elections/resultsarchive/index.asp | 2016 |
| Voteâ€byâ€Mail | 2016 |
| County Physical Poll â€Election Day Early Vote Center Provisional Voting backâ€up (Federal Writeâ€domestic/overseas and | 2016 |
| In Absentee Ballot) | 2016 |
| Alachua 50,491 51,097 29,040 993 | 2016 |
| Baker 3,584 7,180 2,003 1 | 2016 |
| Bay 25,579 43,963 18,952 302 | 2016 |
| Bradford 5,095 4,675 2,450 17 | 2016 |
| Brevard 109,623 118,427 92,708 370 | 2016 |
| Broward 213,112 426,592 202,904 1,623 | 2016 |
| Calhoun 2,448 2,692 1,053 21 | 2016 |
| Charlotte 23,477 39,644 35,019 71 | 2016 |
| Citrus 26,556 29,323 25,105 157 | 2016 |
| Clay 33,064 54,533 20,165 101 | 2016 |
| Collier 42,681 74,932 57,159 688 | 2016 |
| Columbia 9,132 14,044 5,884 27 | 2016 |
| DeSoto 4,274 4,452 2,216 57 | 2016 |
| Dixie 3,011 2,319 1,935 19 | 2016 |
| Duval 136,287 227,475 76,053 1,108 | 2016 |
| Escambia 59,696 56,213 39,791 371 | 2016 |
| Flagler 15,191 29,696 13,537 210 | 2016 |
| Franklin 2,264 2,314 1,533 15 | 2016 |
| Gadsden 8,359 9,627 4,317 33 | 2016 |
| Gilchrist 4,233 2,412 1,865 6 | 2016 |
| Glades 2,720 863 831 0 | 2016 |
| Gulf 2,293 3,612 1,456 19 | 2016 |
| Hamilton 2,430 1,702 1,392 11 | 2016 |
| Hardee 3,241 3,618 829 16 | 2016 |
| Hendry 3,597 5,810 1,861 9 | 2016 |
| Hernando 35,004 25,911 34,558 54 | 2016 |
| Highlands 16,149 18,996 11,197 139 | 2016 |
| Hillsborough 173,637 242,184 191,201 2,879 | 2016 |
| Holmes 3,669 2,901 2,044 10 | 2016 |
| Indian River 25,359 33,136 23,196 227 | 2016 |
| Jackson 7,970 9,116 4,179 46 | 2016 |
| Jefferson 3,553 2,883 1,317 7 | 2016 |
| Lafayette 1,653 1,147 624 16 | 2016 |
| Lake 65,105 75,197 32,672 643 | 2016 |
| FLORIDA DEPARTMENT OF STATE | 2016 |
| DIVISION OF ELECTIONS | 2016 |
| Voting Activity by Ballot Type for 2016 General Election | 2016 |
| Source: Responses by the County Supervisors of Elections to the U.S. Election Assistance Commission’s 2016 Election | 2016 |
| Administration and Voting Survey. (Columns F1b, F1f, F1c+F1d , E1a of the survey) . Data reflects countyâ€specific number of | 2016 |
| persons who voted and attempted to vote by one of the four categories below. For official election results of ballots cast | 2016 |
| and counted for eligible/registered voters, refer to: http://doe.dos.state.fl.us/elections/resultsarchive/index.asp | 2016 |
| Voteâ€byâ€Mail | 2016 |
| County Physical Poll â€Early Vote Center domestic/overseas and Election Day Provisional Voting backâ€up (Federal Write†| 2016 |
| In Absentee Ballot) | 2016 |
| Lee 74,735 100,893 158,302 730 | 2016 |
| Leon 55,731 67,518 31,792 365 | 2016 |
| Levy 9,287 5,108 5,182 84 | 2016 |
| Liberty 1,497 1,347 508 1 | 2016 |
| Madison 2,792 4,554 1,261 80 | 2016 |
| Manatee 64,556 48,604 68,494 524 | 2016 |
| Marion 72,776 66,191 38,146 285 | 2016 |
| Martin 23,118 34,038 30,375 83 | 2016 |
| Miamiâ€Dade 217,803 476,001 311,597 2,973 | 2016 |
| Monroe 15,597 13,325 14,398 68 | 2016 |
| Nassau 13,480 23,515 10,294 110 | 2016 |
| Okaloosa 34,348 45,186 23,318 355 | 2016 |
| Okeechobee 5,393 5,879 2,596 24 | 2016 |
| Orange 141,015 255,113 169,031 714 | 2016 |
| Osceola 34,044 63,254 45,228 284 | 2016 |
| Palm Beach 278,030 241,376 152,226 2,330 | 2016 |
| Pasco 100,110 77,027 68,565 343 | 2016 |
| Pinellas 168,563 78,177 255,374 754 | 2016 |
| Polk 116,350 92,598 80,091 1,134 | 2016 |
| Putnam 12,996 13,856 6,621 74 | 2016 |
| Santa Rosa 32,823 37,197 19,143 334 | 2016 |
| Sarasota 75,810 79,817 77,811 612 | 2016 |
| Seminole 56,794 109,262 63,291 629 | 2016 |
| St. Johns 39,611 71,617 27,329 228 | 2016 |
| St. Lucie 44,579 57,289 41,389 135 | 2016 |
| Sumter 12,517 45,545 19,812 13 | 2016 |
| Suwannee 8,006 6,490 4,374 46 | 2016 |
| Taylor 4,011 3,099 2,257 38 | 2016 |
| Union 2,412 2,317 1,025 9 | 2016 |
| Volusia 91,647 97,936 74,939 680 | 2016 |
| Wakulla 5,548 6,824 3,152 8 | 2016 |
| Walton 13,519 13,439 7,141 137 | 2016 |
| Washington 5,050 3,675 2,509 10 | 2016 |
| TOTAL 2,959,085 3,876,753 2,758,617 24,460 | 2016 |
We see that the data frame contains two columns. It appears that due to the format of the pdf file all the columns from the file were compressed into the first column of the data frame along with the page headings. If we were to scroll to the end of the data frame we’d also notice that a summary row was included for each of the ballot types. As the data was not imported as expected we will have to do some extra work to get it into tidy form. For now we’ll focus on df1 and df2 as they are formatted identically. We’ll deal with df3 to df6 later on.
Preparing this data frame will happen in 2 main steps:
Step 1
We’ll first combine df1 and df2 then identify the rows to remove by using non-county keywords
#combine df1 and df2
df_12 <- rbind(df1,df2)
#rename the columns and convert to lower case
names(df_12) <- c("values","year")
df_12$values <- tolower(df_12$values)
#detect non-county strings
to_exclude <- which(str_detect(df_12$values, "vote|division|total|florida|
source|administration|persons|county|
voting|absentee|election"))
#exclude rows that do not contain county data
df_12 <- df_12[-to_exclude, ]
head(df_12)
## values year
## 11 alachua 50,491 51,097 29,040 993 2016
## 12 baker 3,584 7,180 2,003 1 2016
## 13 bay 25,579 43,963 18,952 302 2016
## 14 bradford 5,095 4,675 2,450 17 2016
## 15 brevard 109,623 118,427 92,708 370 2016
## 16 broward 213,112 426,592 202,904 1,623 2016
Looking at the above output we see that the values column contains the county followed by the number of ballots for each vote type. Thus, we can create a county column by extracting only the words from values. Additionally, we’ll create a totals column by removing the commas and extracting the digits portion of the values column.
Step 2
for (i in 1:nrow(df_12)){
df_12$county[i] <- str_extract_all(df_12$values[i], "[[:alpha:]]+") %>% #grab all words
unlist() %>%
paste(collapse = " ")
df_12$totals[i] <- str_replace_all(df_12$values[i], '[[:punct:]]', "") %>% #remove commas
str_extract_all("[[:digit:]]+") %>% #grab all digits
unlist() %>%
paste(collapse = " ")
}
kable(df_12,row.names=FALSE) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "400px")
| values | year | county | totals |
|---|---|---|---|
| alachua 50,491 51,097 29,040 993 | 2016 | alachua | 50491 51097 29040 993 |
| baker 3,584 7,180 2,003 1 | 2016 | baker | 3584 7180 2003 1 |
| bay 25,579 43,963 18,952 302 | 2016 | bay | 25579 43963 18952 302 |
| bradford 5,095 4,675 2,450 17 | 2016 | bradford | 5095 4675 2450 17 |
| brevard 109,623 118,427 92,708 370 | 2016 | brevard | 109623 118427 92708 370 |
| broward 213,112 426,592 202,904 1,623 | 2016 | broward | 213112 426592 202904 1623 |
| calhoun 2,448 2,692 1,053 21 | 2016 | calhoun | 2448 2692 1053 21 |
| charlotte 23,477 39,644 35,019 71 | 2016 | charlotte | 23477 39644 35019 71 |
| citrus 26,556 29,323 25,105 157 | 2016 | citrus | 26556 29323 25105 157 |
| clay 33,064 54,533 20,165 101 | 2016 | clay | 33064 54533 20165 101 |
| collier 42,681 74,932 57,159 688 | 2016 | collier | 42681 74932 57159 688 |
| columbia 9,132 14,044 5,884 27 | 2016 | columbia | 9132 14044 5884 27 |
| desoto 4,274 4,452 2,216 57 | 2016 | desoto | 4274 4452 2216 57 |
| dixie 3,011 2,319 1,935 19 | 2016 | dixie | 3011 2319 1935 19 |
| duval 136,287 227,475 76,053 1,108 | 2016 | duval | 136287 227475 76053 1108 |
| escambia 59,696 56,213 39,791 371 | 2016 | escambia | 59696 56213 39791 371 |
| flagler 15,191 29,696 13,537 210 | 2016 | flagler | 15191 29696 13537 210 |
| franklin 2,264 2,314 1,533 15 | 2016 | franklin | 2264 2314 1533 15 |
| gadsden 8,359 9,627 4,317 33 | 2016 | gadsden | 8359 9627 4317 33 |
| gilchrist 4,233 2,412 1,865 6 | 2016 | gilchrist | 4233 2412 1865 6 |
| glades 2,720 863 831 0 | 2016 | glades | 2720 863 831 0 |
| gulf 2,293 3,612 1,456 19 | 2016 | gulf | 2293 3612 1456 19 |
| hamilton 2,430 1,702 1,392 11 | 2016 | hamilton | 2430 1702 1392 11 |
| hardee 3,241 3,618 829 16 | 2016 | hardee | 3241 3618 829 16 |
| hendry 3,597 5,810 1,861 9 | 2016 | hendry | 3597 5810 1861 9 |
| hernando 35,004 25,911 34,558 54 | 2016 | hernando | 35004 25911 34558 54 |
| highlands 16,149 18,996 11,197 139 | 2016 | highlands | 16149 18996 11197 139 |
| hillsborough 173,637 242,184 191,201 2,879 | 2016 | hillsborough | 173637 242184 191201 2879 |
| holmes 3,669 2,901 2,044 10 | 2016 | holmes | 3669 2901 2044 10 |
| indian river 25,359 33,136 23,196 227 | 2016 | indian river | 25359 33136 23196 227 |
| jackson 7,970 9,116 4,179 46 | 2016 | jackson | 7970 9116 4179 46 |
| jefferson 3,553 2,883 1,317 7 | 2016 | jefferson | 3553 2883 1317 7 |
| lafayette 1,653 1,147 624 16 | 2016 | lafayette | 1653 1147 624 16 |
| lake 65,105 75,197 32,672 643 | 2016 | lake | 65105 75197 32672 643 |
| lee 74,735 100,893 158,302 730 | 2016 | lee | 74735 100893 158302 730 |
| leon 55,731 67,518 31,792 365 | 2016 | leon | 55731 67518 31792 365 |
| levy 9,287 5,108 5,182 84 | 2016 | levy | 9287 5108 5182 84 |
| liberty 1,497 1,347 508 1 | 2016 | liberty | 1497 1347 508 1 |
| madison 2,792 4,554 1,261 80 | 2016 | madison | 2792 4554 1261 80 |
| manatee 64,556 48,604 68,494 524 | 2016 | manatee | 64556 48604 68494 524 |
| marion 72,776 66,191 38,146 285 | 2016 | marion | 72776 66191 38146 285 |
| martin 23,118 34,038 30,375 83 | 2016 | martin | 23118 34038 30375 83 |
| miamiâ€dade 217,803 476,001 311,597 2,973 | 2016 | miamiâ dade | 217803 476001 311597 2973 |
| monroe 15,597 13,325 14,398 68 | 2016 | monroe | 15597 13325 14398 68 |
| nassau 13,480 23,515 10,294 110 | 2016 | nassau | 13480 23515 10294 110 |
| okaloosa 34,348 45,186 23,318 355 | 2016 | okaloosa | 34348 45186 23318 355 |
| okeechobee 5,393 5,879 2,596 24 | 2016 | okeechobee | 5393 5879 2596 24 |
| orange 141,015 255,113 169,031 714 | 2016 | orange | 141015 255113 169031 714 |
| osceola 34,044 63,254 45,228 284 | 2016 | osceola | 34044 63254 45228 284 |
| palm beach 278,030 241,376 152,226 2,330 | 2016 | palm beach | 278030 241376 152226 2330 |
| pasco 100,110 77,027 68,565 343 | 2016 | pasco | 100110 77027 68565 343 |
| pinellas 168,563 78,177 255,374 754 | 2016 | pinellas | 168563 78177 255374 754 |
| polk 116,350 92,598 80,091 1,134 | 2016 | polk | 116350 92598 80091 1134 |
| putnam 12,996 13,856 6,621 74 | 2016 | putnam | 12996 13856 6621 74 |
| santa rosa 32,823 37,197 19,143 334 | 2016 | santa rosa | 32823 37197 19143 334 |
| sarasota 75,810 79,817 77,811 612 | 2016 | sarasota | 75810 79817 77811 612 |
| seminole 56,794 109,262 63,291 629 | 2016 | seminole | 56794 109262 63291 629 |
| st. johns 39,611 71,617 27,329 228 | 2016 | st johns | 39611 71617 27329 228 |
| st. lucie 44,579 57,289 41,389 135 | 2016 | st lucie | 44579 57289 41389 135 |
| sumter 12,517 45,545 19,812 13 | 2016 | sumter | 12517 45545 19812 13 |
| suwannee 8,006 6,490 4,374 46 | 2016 | suwannee | 8006 6490 4374 46 |
| taylor 4,011 3,099 2,257 38 | 2016 | taylor | 4011 3099 2257 38 |
| union 2,412 2,317 1,025 9 | 2016 | union | 2412 2317 1025 9 |
| volusia 91,647 97,936 74,939 680 | 2016 | volusia | 91647 97936 74939 680 |
| wakulla 5,548 6,824 3,152 8 | 2016 | wakulla | 5548 6824 3152 8 |
| walton 13,519 13,439 7,141 137 | 2016 | walton | 13519 13439 7141 137 |
| washington 5,050 3,675 2,509 10 | 2016 | washington | 5050 3675 2509 10 |
| alachua 41,973 17,983 19,091 554 | 2014 | alachua | 41973 17983 19091 554 |
| baker 4,350 2,621 1,653 0 | 2014 | baker | 4350 2621 1653 0 |
| bay 26,898 17,278 14,225 185 | 2014 | bay | 26898 17278 14225 185 |
| bradford 4,997 2,062 1,640 8 | 2014 | bradford | 4997 2062 1640 8 |
| brevard 118,826 33,327 73,765 403 | 2014 | brevard | 118826 33327 73765 403 |
| broward 223,095 129,248 123,820 311 | 2014 | broward | 223095 129248 123820 311 |
| calhoun 2,297 1,258 701 10 | 2014 | calhoun | 2297 1258 701 10 |
| charlotte 23,534 16,463 28,194 38 | 2014 | charlotte | 23534 16463 28194 38 |
| citrus 25,484 13,800 19,777 102 | 2014 | citrus | 25484 13800 19777 102 |
| clay 36,089 19,082 13,699 36 | 2014 | clay | 36089 19082 13699 36 |
| collier 42,921 31,768 40,603 180 | 2014 | collier | 42921 31768 40603 180 |
| columbia 8,658 6,161 3,828 10 | 2014 | columbia | 8658 6161 3828 10 |
| desoto 3,844 2,158 1,622 5 | 2014 | desoto | 3844 2158 1622 5 |
| dixie 2,827 1,196 1,489 7 | 2014 | dixie | 2827 1196 1489 7 |
| duval 143,441 73,604 55,759 1,434 | 2014 | duval | 143441 73604 55759 1434 |
| escambia 53,510 18,421 27,453 85 | 2014 | escambia | 53510 18421 27453 85 |
| flagler 15,407 13,984 9,091 56 | 2014 | flagler | 15407 13984 9091 56 |
| franklin 2,253 998 1,128 11 | 2014 | franklin | 2253 998 1128 11 |
| gadsden 9,099 5,162 3,533 6 | 2014 | gadsden | 9099 5162 3533 6 |
| gilchrist 3,624 1,035 1,405 4 | 2014 | gilchrist | 3624 1035 1405 4 |
| glades 2,057 245 775 0 | 2014 | glades | 2057 245 775 0 |
| gulf 2,288 1,830 1,090 13 | 2014 | gulf | 2288 1830 1090 13 |
| hamilton 2,057 662 1,049 2 | 2014 | hamilton | 2057 662 1049 2 |
| hardee 1,855 1,842 789 5 | 2014 | hardee | 1855 1842 789 5 |
| hendry 3,100 2,281 1,400 7 | 2014 | hendry | 3100 2281 1400 7 |
| hernando 31,957 9,781 23,154 58 | 2014 | hernando | 31957 9781 23154 58 |
| highlands 15,466 9,126 7,900 115 | 2014 | highlands | 15466 9126 7900 115 |
| hillsborough 156,529 88,379 131,659 1,923 | 2014 | hillsborough | 156529 88379 131659 1923 |
| holmes 3,039 1,256 1,512 12 | 2014 | holmes | 3039 1256 1512 12 |
| indian river 24,047 14,749 14,783 115 | 2014 | indian river | 24047 14749 14783 115 |
| jackson 7,349 4,354 3,644 24 | 2014 | jackson | 7349 4354 3644 24 |
| jefferson 3,550 1,587 1,192 27 | 2014 | jefferson | 3550 1587 1192 27 |
| lafayette 1,500 693 480 3 | 2014 | lafayette | 1500 693 480 3 |
| lake 60,379 28,798 24,980 327 | 2014 | lake | 60379 28798 24980 327 |
| lee 70,887 38,077 104,345 761 | 2014 | lee | 70887 38077 104345 761 |
| leon 58,289 27,885 22,939 295 | 2014 | leon | 58289 27885 22939 295 |
| levy 7,222 1,993 4,315 67 | 2014 | levy | 7222 1993 4315 67 |
| liberty 1,514 815 390 4 | 2014 | liberty | 1514 815 390 4 |
| madison 3,250 2,122 1,040 59 | 2014 | madison | 3250 2122 1040 59 |
| manatee 62,249 13,898 45,344 103 | 2014 | manatee | 62249 13898 45344 103 |
| marion 67,987 20,715 33,109 48 | 2014 | marion | 67987 20715 33109 48 |
| martin 23,313 14,615 24,621 41 | 2014 | martin | 23313 14615 24621 41 |
| miamiâ€dade 227,275 116,936 189,324 875 | 2014 | miamiâ dade | 227275 116936 189324 875 |
| monroe 13,633 5,665 9,818 35 | 2014 | monroe | 13633 5665 9818 35 |
| nassau 14,388 8,679 7,888 42 | 2014 | nassau | 14388 8679 7888 42 |
| okaloosa 33,757 14,877 13,409 84 | 2014 | okaloosa | 33757 14877 13409 84 |
| okeechobee 4,665 2,228 2,147 19 | 2014 | okeechobee | 4665 2228 2147 19 |
| orange 130,688 71,962 110,637 379 | 2014 | orange | 130688 71962 110637 379 |
| osceola 31,266 15,866 22,544 152 | 2014 | osceola | 31266 15866 22544 152 |
| palm beach 241,002 93,593 86,952 1,563 | 2014 | palm beach | 241002 93593 86952 1563 |
| pasco 83,556 32,455 46,912 137 | 2014 | pasco | 83556 32455 46912 137 |
| pinellas 133,828 18,987 204,706 292 | 2014 | pinellas | 133828 18987 204706 292 |
| polk 104,849 29,473 60,309 301 | 2014 | polk | 104849 29473 60309 301 |
| putnam 12,596 5,422 5,041 21 | 2014 | putnam | 12596 5422 5041 21 |
| santa rosa 30,964 11,982 10,785 110 | 2014 | santa rosa | 30964 11982 10785 110 |
| sarasota 77,364 30,903 55,160 285 | 2014 | sarasota | 77364 30903 55160 285 |
| seminole 66,202 36,368 45,222 229 | 2014 | seminole | 66202 36368 45222 229 |
| st. johns 44,479 26,845 16,572 70 | 2014 | st johns | 44479 26845 16572 70 |
| st. lucie 41,534 19,632 29,399 111 | 2014 | st lucie | 41534 19632 29399 111 |
| sumter 16,260 25,471 13,941 19 | 2014 | sumter | 16260 25471 13941 19 |
| suwannee 6,947 2,424 3,498 35 | 2014 | suwannee | 6947 2424 3498 35 |
| taylor 3,738 1,206 1,826 23 | 2014 | taylor | 3738 1206 1826 23 |
| union 2,396 1,713 1,055 5 | 2014 | union | 2396 1713 1055 5 |
| volusia 88,743 33,259 56,719 281 | 2014 | volusia | 88743 33259 56719 281 |
| wakulla 5,911 3,176 2,599 6 | 2014 | wakulla | 5911 3176 2599 6 |
| walton 11,495 4,763 4,311 56 | 2014 | walton | 11495 4763 4311 56 |
| washington 4,454 1,846 2,065 9 | 2014 | washington | 4454 1846 2065 9 |
It appears that our process has worked. There is a column for county and also a column for totals. Let’s make sure that the counties are correctly spelled before moving forward.
unique(df_12$county)
## [1] "alachua" "baker" "bay" "bradford"
## [5] "brevard" "broward" "calhoun" "charlotte"
## [9] "citrus" "clay" "collier" "columbia"
## [13] "desoto" "dixie" "duval" "escambia"
## [17] "flagler" "franklin" "gadsden" "gilchrist"
## [21] "glades" "gulf" "hamilton" "hardee"
## [25] "hendry" "hernando" "highlands" "hillsborough"
## [29] "holmes" "indian river" "jackson" "jefferson"
## [33] "lafayette" "lake" "lee" "leon"
## [37] "levy" "liberty" "madison" "manatee"
## [41] "marion" "martin" "miamiâ dade" "monroe"
## [45] "nassau" "okaloosa" "okeechobee" "orange"
## [49] "osceola" "palm beach" "pasco" "pinellas"
## [53] "polk" "putnam" "santa rosa" "sarasota"
## [57] "seminole" "st johns" "st lucie" "sumter"
## [61] "suwannee" "taylor" "union" "volusia"
## [65] "wakulla" "walton" "washington"
Every county looks fine except for Miami-Dade. We’ll rename this county then separate the totals column into election_day, early_voting and absentee columns.
df_12 <- df_12 %>%
mutate(county = gsub("miami? dade", "miami-dade", county)) %>% #correct miami-dade spelling
separate(totals, c("election_day", "early_voting", "absentee", "provisional")) %>% #separate columns
select(-c(values, provisional)) #remove unnecessary columns
kable(df_12) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "400px")
| year | county | election_day | early_voting | absentee |
|---|---|---|---|---|
| 2016 | alachua | 50491 | 51097 | 29040 |
| 2016 | baker | 3584 | 7180 | 2003 |
| 2016 | bay | 25579 | 43963 | 18952 |
| 2016 | bradford | 5095 | 4675 | 2450 |
| 2016 | brevard | 109623 | 118427 | 92708 |
| 2016 | broward | 213112 | 426592 | 202904 |
| 2016 | calhoun | 2448 | 2692 | 1053 |
| 2016 | charlotte | 23477 | 39644 | 35019 |
| 2016 | citrus | 26556 | 29323 | 25105 |
| 2016 | clay | 33064 | 54533 | 20165 |
| 2016 | collier | 42681 | 74932 | 57159 |
| 2016 | columbia | 9132 | 14044 | 5884 |
| 2016 | desoto | 4274 | 4452 | 2216 |
| 2016 | dixie | 3011 | 2319 | 1935 |
| 2016 | duval | 136287 | 227475 | 76053 |
| 2016 | escambia | 59696 | 56213 | 39791 |
| 2016 | flagler | 15191 | 29696 | 13537 |
| 2016 | franklin | 2264 | 2314 | 1533 |
| 2016 | gadsden | 8359 | 9627 | 4317 |
| 2016 | gilchrist | 4233 | 2412 | 1865 |
| 2016 | glades | 2720 | 863 | 831 |
| 2016 | gulf | 2293 | 3612 | 1456 |
| 2016 | hamilton | 2430 | 1702 | 1392 |
| 2016 | hardee | 3241 | 3618 | 829 |
| 2016 | hendry | 3597 | 5810 | 1861 |
| 2016 | hernando | 35004 | 25911 | 34558 |
| 2016 | highlands | 16149 | 18996 | 11197 |
| 2016 | hillsborough | 173637 | 242184 | 191201 |
| 2016 | holmes | 3669 | 2901 | 2044 |
| 2016 | indian river | 25359 | 33136 | 23196 |
| 2016 | jackson | 7970 | 9116 | 4179 |
| 2016 | jefferson | 3553 | 2883 | 1317 |
| 2016 | lafayette | 1653 | 1147 | 624 |
| 2016 | lake | 65105 | 75197 | 32672 |
| 2016 | lee | 74735 | 100893 | 158302 |
| 2016 | leon | 55731 | 67518 | 31792 |
| 2016 | levy | 9287 | 5108 | 5182 |
| 2016 | liberty | 1497 | 1347 | 508 |
| 2016 | madison | 2792 | 4554 | 1261 |
| 2016 | manatee | 64556 | 48604 | 68494 |
| 2016 | marion | 72776 | 66191 | 38146 |
| 2016 | martin | 23118 | 34038 | 30375 |
| 2016 | miamiâ dade | 217803 | 476001 | 311597 |
| 2016 | monroe | 15597 | 13325 | 14398 |
| 2016 | nassau | 13480 | 23515 | 10294 |
| 2016 | okaloosa | 34348 | 45186 | 23318 |
| 2016 | okeechobee | 5393 | 5879 | 2596 |
| 2016 | orange | 141015 | 255113 | 169031 |
| 2016 | osceola | 34044 | 63254 | 45228 |
| 2016 | palm beach | 278030 | 241376 | 152226 |
| 2016 | pasco | 100110 | 77027 | 68565 |
| 2016 | pinellas | 168563 | 78177 | 255374 |
| 2016 | polk | 116350 | 92598 | 80091 |
| 2016 | putnam | 12996 | 13856 | 6621 |
| 2016 | santa rosa | 32823 | 37197 | 19143 |
| 2016 | sarasota | 75810 | 79817 | 77811 |
| 2016 | seminole | 56794 | 109262 | 63291 |
| 2016 | st johns | 39611 | 71617 | 27329 |
| 2016 | st lucie | 44579 | 57289 | 41389 |
| 2016 | sumter | 12517 | 45545 | 19812 |
| 2016 | suwannee | 8006 | 6490 | 4374 |
| 2016 | taylor | 4011 | 3099 | 2257 |
| 2016 | union | 2412 | 2317 | 1025 |
| 2016 | volusia | 91647 | 97936 | 74939 |
| 2016 | wakulla | 5548 | 6824 | 3152 |
| 2016 | walton | 13519 | 13439 | 7141 |
| 2016 | washington | 5050 | 3675 | 2509 |
| 2014 | alachua | 41973 | 17983 | 19091 |
| 2014 | baker | 4350 | 2621 | 1653 |
| 2014 | bay | 26898 | 17278 | 14225 |
| 2014 | bradford | 4997 | 2062 | 1640 |
| 2014 | brevard | 118826 | 33327 | 73765 |
| 2014 | broward | 223095 | 129248 | 123820 |
| 2014 | calhoun | 2297 | 1258 | 701 |
| 2014 | charlotte | 23534 | 16463 | 28194 |
| 2014 | citrus | 25484 | 13800 | 19777 |
| 2014 | clay | 36089 | 19082 | 13699 |
| 2014 | collier | 42921 | 31768 | 40603 |
| 2014 | columbia | 8658 | 6161 | 3828 |
| 2014 | desoto | 3844 | 2158 | 1622 |
| 2014 | dixie | 2827 | 1196 | 1489 |
| 2014 | duval | 143441 | 73604 | 55759 |
| 2014 | escambia | 53510 | 18421 | 27453 |
| 2014 | flagler | 15407 | 13984 | 9091 |
| 2014 | franklin | 2253 | 998 | 1128 |
| 2014 | gadsden | 9099 | 5162 | 3533 |
| 2014 | gilchrist | 3624 | 1035 | 1405 |
| 2014 | glades | 2057 | 245 | 775 |
| 2014 | gulf | 2288 | 1830 | 1090 |
| 2014 | hamilton | 2057 | 662 | 1049 |
| 2014 | hardee | 1855 | 1842 | 789 |
| 2014 | hendry | 3100 | 2281 | 1400 |
| 2014 | hernando | 31957 | 9781 | 23154 |
| 2014 | highlands | 15466 | 9126 | 7900 |
| 2014 | hillsborough | 156529 | 88379 | 131659 |
| 2014 | holmes | 3039 | 1256 | 1512 |
| 2014 | indian river | 24047 | 14749 | 14783 |
| 2014 | jackson | 7349 | 4354 | 3644 |
| 2014 | jefferson | 3550 | 1587 | 1192 |
| 2014 | lafayette | 1500 | 693 | 480 |
| 2014 | lake | 60379 | 28798 | 24980 |
| 2014 | lee | 70887 | 38077 | 104345 |
| 2014 | leon | 58289 | 27885 | 22939 |
| 2014 | levy | 7222 | 1993 | 4315 |
| 2014 | liberty | 1514 | 815 | 390 |
| 2014 | madison | 3250 | 2122 | 1040 |
| 2014 | manatee | 62249 | 13898 | 45344 |
| 2014 | marion | 67987 | 20715 | 33109 |
| 2014 | martin | 23313 | 14615 | 24621 |
| 2014 | miamiâ dade | 227275 | 116936 | 189324 |
| 2014 | monroe | 13633 | 5665 | 9818 |
| 2014 | nassau | 14388 | 8679 | 7888 |
| 2014 | okaloosa | 33757 | 14877 | 13409 |
| 2014 | okeechobee | 4665 | 2228 | 2147 |
| 2014 | orange | 130688 | 71962 | 110637 |
| 2014 | osceola | 31266 | 15866 | 22544 |
| 2014 | palm beach | 241002 | 93593 | 86952 |
| 2014 | pasco | 83556 | 32455 | 46912 |
| 2014 | pinellas | 133828 | 18987 | 204706 |
| 2014 | polk | 104849 | 29473 | 60309 |
| 2014 | putnam | 12596 | 5422 | 5041 |
| 2014 | santa rosa | 30964 | 11982 | 10785 |
| 2014 | sarasota | 77364 | 30903 | 55160 |
| 2014 | seminole | 66202 | 36368 | 45222 |
| 2014 | st johns | 44479 | 26845 | 16572 |
| 2014 | st lucie | 41534 | 19632 | 29399 |
| 2014 | sumter | 16260 | 25471 | 13941 |
| 2014 | suwannee | 6947 | 2424 | 3498 |
| 2014 | taylor | 3738 | 1206 | 1826 |
| 2014 | union | 2396 | 1713 | 1055 |
| 2014 | volusia | 88743 | 33259 | 56719 |
| 2014 | wakulla | 5911 | 3176 | 2599 |
| 2014 | walton | 11495 | 4763 | 4311 |
| 2014 | washington | 4454 | 1846 | 2065 |
One final check of the data reveals everything is correct! Now let’s proceed to the 2012 data.
As data for 2012 was not extracted with the other data we will first have to extract it.
df7 <- extract_tables(paste0("http://dos.myflorida.com", link_2012))
glimpse(df7)
## List of 2
## $ : chr [1:42, 1] "FLORIDA DEPARTMENT OF STATE" "Division of Elections" "Ballots-by-Type Activity for 2012 General Election" "" ...
## $ : chr [1:38, 1:5] "Jackson" "Jefferson" "Lafayette" "Lake" ...
It appears that df7 was extracted as a list containing two elements.The first element contains contains 42 rows and 1 column which is the same structure of the values column from df_12. The second element of the list contains 38 rows and 5 columns. Note that both elements of the list contain data for 2012 but the data was split into two different parts due to the format of the pdf file, i.e. the first page contained headers and a link while the second page only contained a table of data. We’ll convert each element to a data frame then combine the two.
Let’s have a look at the data to better understand it. The left portion represents the data from the first list element while the right portion represents the second. We see how the headings on the first page of the pdf have completely messed with the structure of the first table. In fact, the data in the second list element is practically in tidy format, It just needs to be converted to a data frame and given column headings.
kable(df7) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "400px")
|
|
#extract first list element
first_table <- df7[[1]]
first_table <- first_table[13:42, ] %>% data.frame() #get necesssary rows and convert to df
names(first_table) <- 'col1'
#tidy the data
first_table <- first_table %>%
#combine Indian River to make separating on space easy
mutate(col1 = str_replace_all(col1,"Indian River", "IndianRiver")) %>%
separate(col1, c("county", "election_day","early_voting", "absentee","provisional")) %>%
mutate(county = str_replace_all(county,"IndianRiver", "Indian River"))
#extract second list element
second_table <- data.frame(df7[[2]])
names(second_table) <- c("county", "election_day","early_voting", "absentee","provisional")
#combine the two tables
df7 <- rbind(first_table, second_table)
df7$year <- '2012' #create year column
df7$provisional <- NULL #drop provisional column
df7 <- df7[-68, ] #remove row 68 as it contained the totals
kable(df7) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "400px")
| county | election_day | early_voting | absentee | year |
|---|---|---|---|---|
| Alachua | 54204 | 39271 | 27393 | 2012 |
| Baker | 4101 | 3601 | 1880 | 2012 |
| Bay | 29255 | 33857 | 17797 | 2012 |
| Bradford | 5869 | 3958 | 1887 | 2012 |
| Brevard | 144657 | 51449 | 91170 | 2012 |
| Broward | 341637 | 245952 | 171785 | 2012 |
| Calhoun | 2822 | 2273 | 1112 | 2012 |
| Charlotte | 24979 | 26688 | 33653 | 2012 |
| Citrus | 26650 | 24686 | 23031 | 2012 |
| Clay | 36171 | 41692 | 18842 | 2012 |
| Collier | 49089 | 52290 | 48646 | 2012 |
| Columbia | 9271 | 13177 | 4829 | 2012 |
| DeSoto | 4694 | 3138 | 2107 | 2012 |
| Dixie | 3776 | 1262 | 1959 | 2012 |
| Duval | 157365 | 174420 | 82913 | 2012 |
| Escambia | 67630 | 42861 | 39647 | 2012 |
| Flagler | 20070 | 19906 | 10651 | 2012 |
| Franklin | 2618 | 1470 | 1437 | 2012 |
| Gadsden | 8895 | 9598 | 4123 | 2012 |
| Gilchrist | 4249 | 1935 | 1789 | 2012 |
| Glades | 2541 | 498 | 943 | 2012 |
| Gulf | 2591 | 3046 | 1517 | 2012 |
| Hamilton | 2659 | 1442 | 1355 | 2012 |
| Hardee | 3346 | 2846 | 1083 | 2012 |
| Hendry | 4537 | 3846 | 1933 | 2012 |
| Hernando | 40030 | 12752 | 27806 | 2012 |
| Highlands | 16103 | 16991 | 9806 | 2012 |
| Hillsborough | 205283 | 166990 | 171666 | 2012 |
| Holmes | 4091 | 2391 | 1892 | 2012 |
| Indian River | 23841 | 28646 | 19630 | 2012 |
| Jackson | 8913 | 7777 | 4371 | 2012 |
| Jefferson | 3904 | 2481 | 1454 | 2012 |
| Lafayette | 1892 | 962 | 589 | 2012 |
| Lake | 63751 | 57612 | 30793 | 2012 |
| Lee | 125346 | 51932 | 89638 | 2012 |
| Leon | 72848 | 45110 | 30316 | 2012 |
| Levy | 9816 | 3653 | 5108 | 2012 |
| Liberty | 1617 | 1210 | 496 | 2012 |
| Madison | 3608 | 3669 | 1411 | 2012 |
| Manatee | 91799 | 16357 | 45880 | 2012 |
| Marion | 80570 | 43209 | 39052 | 2012 |
| Martin | 21739 | 26335 | 31425 | 2012 |
| Miami-Dade | 406841 | 235916 | 247285 | 2012 |
| Monroe | 14491 | 11106 | 13778 | 2012 |
| Nassau | 11658 | 18339 | 10772 | 2012 |
| Okaloosa | 40005 | 32847 | 22496 | 2012 |
| Okeechobee | 5895 | 3525 | 2975 | 2012 |
| Orange | 205018 | 127583 | 137125 | 2012 |
| Osceola | 41532 | 41736 | 26484 | 2012 |
| Palm Beach | 351600 | 124833 | 130532 | 2012 |
| Pasco | 94321 | 60974 | 59648 | 2012 |
| Pinellas | 172605 | 39569 | 250113 | 2012 |
| Polk | 128094 | 55120 | 66392 | 2012 |
| Putnam | 14169 | 10622 | 6743 | 2012 |
| Santa Rosa | 37931 | 22639 | 16648 | 2012 |
| Sarasota | 78328 | 55473 | 73876 | 2012 |
| Seminole | 89594 | 66721 | 53189 | 2012 |
| St. Johns | 39631 | 52565 | 23515 | 2012 |
| St. Lucie | 48960 | 37353 | 37041 | 2012 |
| Sumter | 16559 | 30269 | 14028 | 2012 |
| Suwannee | 8941 | 4593 | 4276 | 2012 |
| Taylor | 4490 | 2518 | 2148 | 2012 |
| Union | 2645 | 1662 | 1139 | 2012 |
| Volusia | 110545 | 61097 | 63821 | 2012 |
| Wakulla | 5940 | 5431 | 3321 | 2012 |
| Walton | 13206 | 10088 | 5319 | 2012 |
| Washington | 5120 | 3279 | 2636 | 2012 |
Okay, everything looks great so far. Let’s proceed to the 2004 to 2010 data.
Let’s begin by combining df3, df4, df5 and df6 and then inspecting data frame.
df3_6 <- rbind(df3,df4,df5,df6)
kable(df3_6) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "400px")
| X1 | X2 | X3 | X4 | year |
|---|---|---|---|---|
| COUNTY | ELECTION DAY BALLOTS CAST | EARLY VOTING BALLOTS CAST | ABSENTEE BALLOTS CAST | 2010 |
| 2010 | ||||
| ALACHUA | 44570 | 16549 | 13567 | 2010 |
| BAKER | 4312 | 2748 | 1013 | 2010 |
| BAY | 27101 | 17978 | 10148 | 2010 |
| BRADFORD | 5298 | 1746 | 1145 | 2010 |
| BREVARD | 125316 | 23387 | 49967 | 2010 |
| BROWARD | 264978 | 85768 | 76023 | 2010 |
| CALHOUN | 2894 | 1008 | 469 | 2010 |
| CHARLOTTE | 23093 | 17514 | 19749 | 2010 |
| CITRUS | 27442 | 15141 | 13133 | 2010 |
| CLAY | 37780 | 18424 | 7684 | 2010 |
| COLLIER | 47001 | 29282 | 28290 | 2010 |
| COLUMBIA | 9481 | 6472 | 3149 | 2010 |
| DESOTO | 4008 | 1792 | 1227 | 2010 |
| DIXIE | 3380 | 910 | 1117 | 2010 |
| DUVAL | 160334 | 52279 | 48260 | 2010 |
| ESCAMBIA | 59410 | 16991 | 18941 | 2010 |
| FLAGLER | 18203 | 10713 | 4906 | 2010 |
| FRANKLIN | 2204 | 1035 | 877 | 2010 |
| GADSDEN | 10044 | 4597 | 2136 | 2010 |
| GILCHRIST | 3607 | 906 | 952 | 2010 |
| GLADES | 2075 | 176 | 555 | 2010 |
| GULF | 2420 | 1867 | 807 | 2010 |
| HAMILTON | 2471 | 701 | 897 | 2010 |
| HARDEE | 2873 | 1796 | 655 | 2010 |
| HENDRY | 4046 | 1657 | 940 | 2010 |
| HERNANDO | 36729 | 10313 | 12501 | 2010 |
| HIGHLANDS | 15830 | 10237 | 4351 | 2010 |
| HILLSBOROUGH | 178323 | 77351 | 65167 | 2010 |
| HOLMES | 3624 | 1322 | 1260 | 2010 |
| INDIAN RIVER | 24286 | 12582 | 11477 | 2010 |
| JACKSON | 8293 | 4157 | 2727 | 2010 |
| JEFFERSON | 3773 | 1673 | 828 | 2010 |
| LAFAYETTE | 1614 | 619 | 324 | 2010 |
| LAKE | 56958 | 26280 | 20691 | 2010 |
| LEE | 101711 | 36876 | 44754 | 2010 |
| LEON | 63435 | 18594 | 18412 | 2010 |
| LEVY | 7901 | 1671 | 3343 | 2010 |
| LIBERTY | 1468 | 517 | 242 | 2010 |
| MADISON | 3691 | 1533 | 1087 | 2010 |
| MANATEE | 73158 | 9765 | 24667 | 2010 |
| MARION | 71925 | 19737 | 24821 | 2010 |
| MARTIN | 25818 | 12070 | 17435 | 2010 |
| MIAMI-DADE | 268211 | 96884 | 134598 | 2010 |
| MONROE | 14469 | 4058 | 7971 | 2010 |
| COUNTY | ELECTION DAY BALLOTS CAST | EARLY VOTING BALLOTS CAST | ABSENTEE BALLOTS CAST | 2010 |
| 2010 | ||||
| NASSAU | 13870 | 8128 | 5120 | 2010 |
| OKALOOSA | 36520 | 13482 | 9058 | 2010 |
| OKEECHOBEE | 4685 | 2175 | 1511 | 2010 |
| ORANGE | 141692 | 51769 | 82355 | 2010 |
| OSCEOLA | 29050 | 16394 | 8985 | 2010 |
| PALM BEACH | 265127 | 62640 | 61502 | 2010 |
| PASCO | 85976 | 31124 | 21828 | 2010 |
| PINELLAS | 140751 | 10068 | 158668 | 2010 |
| POLK | 100149 | 22754 | 39937 | 2010 |
| PUTNAM | 14510 | 3841 | 3378 | 2010 |
| SANTA ROSA | 29863 | 11952 | 7369 | 2010 |
| SARASOTA | 82013 | 38276 | 27551 | 2010 |
| SEMINOLE | 72165 | 33205 | 29481 | 2010 |
| ST. JOHNS | 39285 | 24443 | 11370 | 2010 |
| ST. LUCIE | 41522 | 15259 | 20083 | 2010 |
| SUMTER | 14824 | 21694 | 6100 | 2010 |
| SUWANNEE | 8339 | 2684 | 2603 | 2010 |
| TAYLOR | 4103 | 1222 | 1388 | 2010 |
| UNION | 2378 | 501 | 796 | 2010 |
| VOLUSIA | 98368 | 29520 | 32621 | 2010 |
| WAKULLA | 6053 | 2997 | 2056 | 2010 |
| WALTON | 10816 | 5246 | 3090 | 2010 |
| WASHINGTON | 4771 | 1656 | 1743 | 2010 |
| 2010 | ||||
| TOTALS | 3,102,388 | 1,088,706 | 1,241,856 | 2010 |
| COUNTY | ELECTION DAY BALLOTS CAST | EARLY VOTING BALLOTS CAST | ABSENTEE BALLOTS CAST | 2008 |
| 2008 | ||||
| Alachua | 49,077 | 53,556 | 22,003 | 2008 |
| Baker | 4,193 | 5,328 | 1,562 | 2008 |
| Bay | 33,385 | 29,404 | 16,038 | 2008 |
| Bradford | 5,525 | 4,430 | 1,740 | 2008 |
| Brevard | 168,547 | 51,879 | 63,479 | 2008 |
| Broward | 345,855 | 252,061 | 132,268 | 2008 |
| Calhoun | 3,392 | 2,154 | 712 | 2008 |
| Charlotte | 23,461 | 33,580 | 28,536 | 2008 |
| Citrus | 30,002 | 28,011 | 18,400 | 2008 |
| Clay | 40,397 | 39,141 | 12,605 | 2008 |
| Collier | 53,022 | 52,734 | 36,840 | 2008 |
| Columbia | 9,917 | 12,945 | 5,190 | 2008 |
| DeSoto | 4,520 | 3,761 | 1,888 | 2008 |
| Dixie | 4,052 | 1,742 | 1,544 | 2008 |
| Duval | 151,203 | 183,410 | 76,202 | 2008 |
| Escambia | 66,512 | 49,180 | 29,991 | 2008 |
| Flagler | 18,671 | 22,457 | 7,949 | 2008 |
| Franklin | 2,629 | 1,943 | 1,469 | 2008 |
| Gadsden | 8,829 | 10,559 | 3,064 | 2008 |
| Gilchrist | 4,242 | 2,192 | 1,360 | 2008 |
| Glades | 2,785 | 595 | 894 | 2008 |
| Gulf | 2,608 | 3,321 | 1,251 | 2008 |
| Hamilton | 2,893 | 1,542 | 1,129 | 2008 |
| Hardee | 3,202 | 3,271 | 973 | 2008 |
| Hendry | 4,402 | 4,872 | 1,652 | 2008 |
| Hernando | 45,231 | 23,093 | 19,875 | 2008 |
| Highlands | 16,203 | 20,806 | 8,088 | 2008 |
| Hillsborough | 246,303 | 146,518 | 114,765 | 2008 |
| Holmes | 4,315 | 2,587 | 1,733 | 2008 |
| Indian River | 25,670 | 27,933 | 16,792 | 2008 |
| Jackson | 11,271 | 6,967 | 3,323 | 2008 |
| Jefferson | 3,800 | 3,007 | 1,199 | 2008 |
| Lafayette | 1,801 | 1,041 | 530 | 2008 |
| Lake | 60,393 | 57,528 | 29,446 | 2008 |
| Lee | 123,872 | 64,081 | 82,030 | 2008 |
| Leon | 77,194 | 42,432 | 27,810 | 2008 |
| Levy | 10,146 | 4,254 | 4,406 | 2008 |
| Liberty | 1,850 | 1,028 | 432 | 2008 |
| Madison | 3,980 | 3,454 | 1,471 | 2008 |
| Manatee | 96,193 | 20,177 | 35,515 | 2008 |
| COUNTY | ELECTION DAY BALLOTS CAST | EARLY VOTING BALLOTS CAST | ABSENTEE BALLOTS CAST | 2008 |
| 2008 | ||||
| Marion | 81,455 | 47,683 | 33,419 | 2008 |
| Martin | 24,844 | 28,285 | 25,603 | 2008 |
| Miami-Dade | 372,299 | 326,358 | 170,820 | 2008 |
| Monroe | 14,113 | 13,608 | 12,043 | 2008 |
| Nassau | 10,837 | 18,976 | 8,231 | 2008 |
| Okaloosa | 41,232 | 33,340 | 14,841 | 2008 |
| Okeechobee | 5,531 | 4,782 | 2,536 | 2008 |
| Orange | 201,361 | 145,278 | 119,288 | 2008 |
| Osceola | 43,695 | 41,305 | 14,641 | 2008 |
| Palm Beach | 323,264 | 144,545 | 124,218 | 2008 |
| Pasco | 109,789 | 68,858 | 37,143 | 2008 |
| Pinellas | 237,430 | 46,385 | 180,873 | 2008 |
| Polk | 131,739 | 60,064 | 52,289 | 2008 |
| Putnam | 17,663 | 9,952 | 5,468 | 2008 |
| Santa Rosa | 24,447 | 26,275 | 19,626 | 2008 |
| Sarasota | 82,880 | 70,693 | 30,732 | 2008 |
| Seminole | 85,801 | 77,781 | 11,398 | 2008 |
| St. Johns | 35,905 | 49,241 | 53,087 | 2008 |
| St. Lucie | 47,018 | 43,691 | 42,092 | 2008 |
| Sumter | 11,678 | 28,587 | 8,822 | 2008 |
| Suwannee | 9,079 | 5,246 | 3,399 | 2008 |
| Taylor | 4,576 | 2,835 | 1,986 | 2008 |
| Union | 2,674 | 1,386 | 1,282 | 2008 |
| Volusia | 124,715 | 68,371 | 50,724 | 2008 |
| Wakulla | 5,494 | 6,386 | 2,519 | 2008 |
| Walton | 12,435 | 9,649 | 4,891 | 2008 |
| Washington | 5,687 | 3,138 | 2,377 | 2008 |
| 2008 | ||||
| TOTALS | 3,839,184 | 2,661,672 | 1,850,502 | 2008 |
| COUNTY | ELECTION DAY BALLOTS CAST | EARLY VOTING BALLOTS CAST | ABSENTEE BALLOTS CAST | 2006 |
| 2006 | ||||
| Alachua | 47,367 | 13,418 | 10,456 | 2006 |
| Baker | 4,033 | 1,608 | 737 | 2006 |
| Bay | 27,548 | 13,488 | 7,607 | 2006 |
| Bradford | 5,214 | 1,121 | 908 | 2006 |
| Brevard | 185,074 | 12,725 | 28,085 | 2006 |
| Broward | 288,224 | 78,003 | 44,872 | 2006 |
| Calhoun | 2,573 | 522 | 317 | 2006 |
| Charlotte | 28,900 | 17,096 | 9,784 | 2006 |
| Citrus | 32,033 | 9,968 | 10,035 | 2006 |
| Clay | 37,442 | 8,736 | 5,797 | 2006 |
| Collier | 48,059 | 20,436 | 17,847 | 2006 |
| Columbia | 10,128 | 3,397 | 2,226 | 2006 |
| DeSoto | 4,371 | 1,192 | 1,109 | 2006 |
| Dixie | 3,669 | 799 | 704 | 2006 |
| Duval | 173,925 | 34,193 | 30,046 | 2006 |
| Escambia | 62,294 | 8,392 | 17,209 | 2006 |
| Flagler | 19,060 | 7,488 | 3,312 | 2006 |
| Franklin | 2,468 | 763 | 764 | 2006 |
| Gadsden | 12,552 | 2,603 | 1,640 | 2006 |
| Gilchrist | 3,867 | 908 | 673 | 2006 |
| Glades | 2,304 | 140 | 547 | 2006 |
| Gulf | 3,068 | 1,242 | 627 | 2006 |
| Hamilton | 2,515 | 382 | 755 | 2006 |
| Hardee | 2,828 | 1,218 | 516 | 2006 |
| Hendry | 3,639 | 1,120 | 753 | 2006 |
| Hernando | 40,672 | 8,206 | 8,410 | 2006 |
| Highlands | 17,599 | 8,948 | 4,043 | 2006 |
| Hillsborough | 184,379 | 54,051 | 53,072 | 2006 |
| Holmes | 2,419 | 1,236 | 930 | 2006 |
| Indian River | 27,204 | 10,538 | 6,181 | 2006 |
| Jackson | 9,707 | 2,008 | 1,592 | 2006 |
| Jefferson | 3,905 | 1,336 | 837 | 2006 |
| Lafayette | 1,444 | 403 | 370 | 2006 |
| Lake | 58,530 | 16,744 | 11,845 | 2006 |
| Lee | 108,042 | 21,914 | 25,491 | 2006 |
| Leon | 68,544 | 7,689 | 14,763 | 2006 |
| Levy | 8,474 | 778 | 2,215 | 2006 |
| Liberty | 1,840 | 377 | 204 | 2006 |
| Madison | 4,777 | 1,234 | 611 | 2006 |
| Manatee | 77,538 | 7,078 | 16,044 | 2006 |
| COUNTY | ELECTION DAY BALLOTS CAST | EARLY VOTING BALLOTS CAST | ABSENTEE BALLOTS CAST | 2006 |
| 2006 | ||||
| Marion | 75,043 | 10,736 | 16,046 | 2006 |
| Martin | 28,298 | 15,946 | 9,217 | 2006 |
| Miami-Dade | 279,910 | 66,190 | 65,955 | 2006 |
| Monroe | 14,983 | 2,700 | 7,012 | 2006 |
| Nassau | 14,101 | 5,527 | 3,061 | 2006 |
| Okaloosa | 37,869 | 9,171 | 9,720 | 2006 |
| Okeechobee | 5,408 | 1,712 | 1,261 | 2006 |
| Orange | 149,732 | 37,052 | 36,232 | 2006 |
| Osceola | 33,752 | 5,806 | 5,429 | 2006 |
| Palm Beach | 255,922 | 59,720 | 54,954 | 2006 |
| Pasco | 94,458 | 19,020 | 16,814 | 2006 |
| Pinellas | 207,960 | 35,338 | 50,620 | 2006 |
| Polk | 100,360 | 13,753 | 28,910 | 2006 |
| Putnam | 15,672 | 1,807 | 2,495 | 2006 |
| Santa Rosa | 28,637 | 7,976 | 6,108 | 2006 |
| Sarasota | 88,927 | 30,832 | 22,831 | 2006 |
| Seminole | 73,999 | 21,450 | 15,314 | 2006 |
| St. Johns | 41,314 | 11,805 | 8,187 | 2006 |
| St. Lucie | 45,553 | 13,132 | 12,124 | 2006 |
| Sumter | 12,193 | 15,266 | 3,137 | 2006 |
| Suwannee | 8,167 | 1,101 | 1,840 | 2006 |
| Taylor | 5,665 | 833 | 1,013 | 2006 |
| Union | 2,153 | 154 | 627 | 2006 |
| Volusia | 111,432 | 19,122 | 23,069 | 2006 |
| Wakulla | 9,674 | 2,335 | 1,213 | 2006 |
| Walton | 10,869 | 3,110 | 2,437 | 2006 |
| Washington | 4,959 | 1,239 | 1,202 | 2006 |
| 2006 | ||||
| TOTALS | 3,385,239 | 796,331 | 750,762 | 2006 |
| COUNTY | ELECTION DAY BALLOTS CAST | EARLY VOTING BALLOTS CAST | ABSENTEE BALLOTS CAST | 2004 |
| 2004 | ||||
| Alachua | 76,381 | 16,389 | 18,956 | 2004 |
| Baker | 5,879 | 2,843 | 1,333 | 2004 |
| Bay | 37,981 | 18,572 | 18,799 | 2004 |
| Bradford | 6,871 | 2,429 | 1,614 | 2004 |
| Brevard | 185,305 | 30,564 | 50,261 | 2004 |
| Broward | 431,488 | 176,743 | 100,388 | 2004 |
| Calhoun | 3,718 | 1,447 | 850 | 2004 |
| Charlotte | 46,741 | 13,659 | 19,803 | 2004 |
| Citrus | 34,177 | 20,117 | 15,527 | 2004 |
| Clay | 57,209 | 6,770 | 17,786 | 2004 |
| Collier | 57,323 | 44,155 | 27,984 | 2004 |
| Columbia | 14,490 | 7,339 | 3,262 | 2004 |
| Miami-Dade | 435,887 | 244,022 | 100,780 | 2004 |
| Desoto | 5,393 | 2,310 | 1,831 | 2004 |
| Dixie | 4,462 | 1,091 | 926 | 2004 |
| Duval | 258,746 | 58,693 | 64,010 | 2004 |
| Escambia | 95,647 | 12,807 | 35,760 | 2004 |
| Flagler | 23,694 | 8,873 | 6,008 | 2004 |
| Franklin | 3,111 | 1,586 | 1,285 | 2004 |
| Gadsden | 12,024 | 5,467 | 3,629 | 2004 |
| Gilchrist | 4,514 | 1,616 | 922 | 2004 |
| Glades | 3,101 | 345 | 750 | 2004 |
| Gulf | 4,099 | 1,725 | 1,489 | 2004 |
| Hamilton | 3,147 | 937 | 1,060 | 2004 |
| Hardee | 3,400 | 2,831 | 1,073 | 2004 |
| Hendry | 5,817 | 2,481 | 1,537 | 2004 |
| Hernando | 50,942 | 13,942 | 15,920 | 2004 |
| Highlands | 23,351 | 11,385 | 7,168 | 2004 |
| Hillsborough | 313,930 | 86,642 | 64,857 | 2004 |
| Holmes | 5,418 | 1,987 | 970 | 2004 |
| Indian River | 32,015 | 17,328 | 12,210 | 2004 |
| Jackson | 12,655 | 4,677 | 2,613 | 2004 |
| Jefferson | 3,979 | 2,486 | 1,040 | 2004 |
| Lafayette | 1,935 | 873 | 548 | 2004 |
| Lake | 68,121 | 32,331 | 24,063 | 2004 |
| Lee | 161,953 | 32,182 | 47,669 | 2004 |
| Leon | 95,397 | 17,974 | 23,000 | 2004 |
| Levy | 11,748 | 2,078 | 2,939 | 2004 |
| Liberty | 1,823 | 864 | 366 | 2004 |
| Madison | 4,905 | 2,446 | 1,007 | 2004 |
| Manatee | 106,917 | 10,274 | 27,052 | 2004 |
| Monroe | 19,706 | 8,918 | 11,091 | 2004 |
| Marion | 99,628 | 14,799 | 25,938 | 2004 |
| Martin | 43,892 | 12,243 | 16,746 | 2004 |
| COUNTY | ELECTION DAY BALLOTS CAST | EARLY VOTING BALLOTS CAST | ABSENTEE BALLOTS CAST | 2004 |
| 2004 | ||||
| Nassau | 15,839 | 9,989 | 7,028 | 2004 |
| Okaloosa | 59,566 | 10,383 | 20,073 | 2004 |
| Okeechobee | 6,565 | 3,507 | 2,194 | 2004 |
| Orange | 243,536 | 79,171 | 68,899 | 2004 |
| Osceola | 52,525 | 19,120 | 10,550 | 2004 |
| Palm Beach | 404,666 | 49,831 | 92,428 | 2004 |
| Pasco | 127,526 | 29,584 | 35,059 | 2004 |
| Pinellas | 299,201 | 69,902 | 89,004 | 2004 |
| Polk | 148,150 | 6,023 | 59,651 | 2004 |
| Putnam | 22,775 | 3,775 | 4,608 | 2004 |
| Santa Rosa | 40,098 | 15,824 | 11,541 | 2004 |
| Sarasota | 122,101 | 33,872 | 40,838 | 2004 |
| Seminole | 135,161 | 20,671 | 30,853 | 2004 |
| St. Johns | 52,151 | 18,758 | 15,378 | 2004 |
| St. Lucie | 58,795 | 22,882 | 18,896 | 2004 |
| Sumter | 11,382 | 16,517 | 4,180 | 2004 |
| Suwannee | 10,666 | 2,466 | 2,736 | 2004 |
| Taylor | 4,671 | 2,421 | 1,533 | 2004 |
| Union | 2,836 | 369 | 1,509 | 2004 |
| Volusia | 144,109 | 40,166 | 44,214 | 2004 |
| Wakulla | 5,372 | 4,776 | 1,671 | 2004 |
| Walton | 12,996 | 6,462 | 4,611 | 2004 |
| Washington | 5,676 | 2,653 | 2,173 | 2004 |
| 2004 | ||||
| STATE TOTAL | 4,865,283 | 1,428,362 | 1,352,447 | 2004 |
It appears that this data frame is in tidy form but there are still a few issues that need to be fixed. For instance the column names are in the first row and the second row appears to be empty across all columns except for year. Also, what we’re seeing here will be repeated throughout df3_6 each time a new data frame is added as each of df3 to df6 are identically formatted.Thus, we cannot simply fix the first two rows of df3_6 and consider it tidy. Fortunately, formatting the data for 2004 to 2010 will not be as challenging as formatting the data for 2012 onwards. We’ll follow a simplified process from the previous sections to get this data frame in tidy form.
#rename columns
colnames(df3_6) <- c("county","election_day","early_voting", "absentee","year")
df3_6$county <- tolower(df3_6$county)
to_exclude <- which(str_detect(df3_6$county,"county|totals|state total"))
df3_6 <- df3_6[-to_exclude, ]
df3_6 <- df3_6[order(df3_6$county),] #order putting empty rows at top
df3_6 <- df3_6[-c(1:12),]
df3_6 <- df3_6[order(df3_6$year),] #order by year
kable(df3_6, row.names = FALSE) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "400px")
| county | election_day | early_voting | absentee | year |
|---|---|---|---|---|
| alachua | 44570 | 16549 | 13567 | 2010 |
| baker | 4312 | 2748 | 1013 | 2010 |
| bay | 27101 | 17978 | 10148 | 2010 |
| bradford | 5298 | 1746 | 1145 | 2010 |
| brevard | 125316 | 23387 | 49967 | 2010 |
| broward | 264978 | 85768 | 76023 | 2010 |
| calhoun | 2894 | 1008 | 469 | 2010 |
| charlotte | 23093 | 17514 | 19749 | 2010 |
| citrus | 27442 | 15141 | 13133 | 2010 |
| clay | 37780 | 18424 | 7684 | 2010 |
| collier | 47001 | 29282 | 28290 | 2010 |
| columbia | 9481 | 6472 | 3149 | 2010 |
| desoto | 4008 | 1792 | 1227 | 2010 |
| dixie | 3380 | 910 | 1117 | 2010 |
| duval | 160334 | 52279 | 48260 | 2010 |
| escambia | 59410 | 16991 | 18941 | 2010 |
| flagler | 18203 | 10713 | 4906 | 2010 |
| franklin | 2204 | 1035 | 877 | 2010 |
| gadsden | 10044 | 4597 | 2136 | 2010 |
| gilchrist | 3607 | 906 | 952 | 2010 |
| glades | 2075 | 176 | 555 | 2010 |
| gulf | 2420 | 1867 | 807 | 2010 |
| hamilton | 2471 | 701 | 897 | 2010 |
| hardee | 2873 | 1796 | 655 | 2010 |
| hendry | 4046 | 1657 | 940 | 2010 |
| hernando | 36729 | 10313 | 12501 | 2010 |
| highlands | 15830 | 10237 | 4351 | 2010 |
| hillsborough | 178323 | 77351 | 65167 | 2010 |
| holmes | 3624 | 1322 | 1260 | 2010 |
| indian river | 24286 | 12582 | 11477 | 2010 |
| jackson | 8293 | 4157 | 2727 | 2010 |
| jefferson | 3773 | 1673 | 828 | 2010 |
| lafayette | 1614 | 619 | 324 | 2010 |
| lake | 56958 | 26280 | 20691 | 2010 |
| lee | 101711 | 36876 | 44754 | 2010 |
| leon | 63435 | 18594 | 18412 | 2010 |
| levy | 7901 | 1671 | 3343 | 2010 |
| liberty | 1468 | 517 | 242 | 2010 |
| madison | 3691 | 1533 | 1087 | 2010 |
| manatee | 73158 | 9765 | 24667 | 2010 |
| marion | 71925 | 19737 | 24821 | 2010 |
| martin | 25818 | 12070 | 17435 | 2010 |
| miami-dade | 268211 | 96884 | 134598 | 2010 |
| monroe | 14469 | 4058 | 7971 | 2010 |
| nassau | 13870 | 8128 | 5120 | 2010 |
| okaloosa | 36520 | 13482 | 9058 | 2010 |
| okeechobee | 4685 | 2175 | 1511 | 2010 |
| orange | 141692 | 51769 | 82355 | 2010 |
| osceola | 29050 | 16394 | 8985 | 2010 |
| palm beach | 265127 | 62640 | 61502 | 2010 |
| pasco | 85976 | 31124 | 21828 | 2010 |
| pinellas | 140751 | 10068 | 158668 | 2010 |
| polk | 100149 | 22754 | 39937 | 2010 |
| putnam | 14510 | 3841 | 3378 | 2010 |
| santa rosa | 29863 | 11952 | 7369 | 2010 |
| sarasota | 82013 | 38276 | 27551 | 2010 |
| seminole | 72165 | 33205 | 29481 | 2010 |
| st. johns | 39285 | 24443 | 11370 | 2010 |
| st. lucie | 41522 | 15259 | 20083 | 2010 |
| sumter | 14824 | 21694 | 6100 | 2010 |
| suwannee | 8339 | 2684 | 2603 | 2010 |
| taylor | 4103 | 1222 | 1388 | 2010 |
| union | 2378 | 501 | 796 | 2010 |
| volusia | 98368 | 29520 | 32621 | 2010 |
| wakulla | 6053 | 2997 | 2056 | 2010 |
| walton | 10816 | 5246 | 3090 | 2010 |
| washington | 4771 | 1656 | 1743 | 2010 |
| alachua | 49,077 | 53,556 | 22,003 | 2008 |
| baker | 4,193 | 5,328 | 1,562 | 2008 |
| bay | 33,385 | 29,404 | 16,038 | 2008 |
| bradford | 5,525 | 4,430 | 1,740 | 2008 |
| brevard | 168,547 | 51,879 | 63,479 | 2008 |
| broward | 345,855 | 252,061 | 132,268 | 2008 |
| calhoun | 3,392 | 2,154 | 712 | 2008 |
| charlotte | 23,461 | 33,580 | 28,536 | 2008 |
| citrus | 30,002 | 28,011 | 18,400 | 2008 |
| clay | 40,397 | 39,141 | 12,605 | 2008 |
| collier | 53,022 | 52,734 | 36,840 | 2008 |
| columbia | 9,917 | 12,945 | 5,190 | 2008 |
| desoto | 4,520 | 3,761 | 1,888 | 2008 |
| dixie | 4,052 | 1,742 | 1,544 | 2008 |
| duval | 151,203 | 183,410 | 76,202 | 2008 |
| escambia | 66,512 | 49,180 | 29,991 | 2008 |
| flagler | 18,671 | 22,457 | 7,949 | 2008 |
| franklin | 2,629 | 1,943 | 1,469 | 2008 |
| gadsden | 8,829 | 10,559 | 3,064 | 2008 |
| gilchrist | 4,242 | 2,192 | 1,360 | 2008 |
| glades | 2,785 | 595 | 894 | 2008 |
| gulf | 2,608 | 3,321 | 1,251 | 2008 |
| hamilton | 2,893 | 1,542 | 1,129 | 2008 |
| hardee | 3,202 | 3,271 | 973 | 2008 |
| hendry | 4,402 | 4,872 | 1,652 | 2008 |
| hernando | 45,231 | 23,093 | 19,875 | 2008 |
| highlands | 16,203 | 20,806 | 8,088 | 2008 |
| hillsborough | 246,303 | 146,518 | 114,765 | 2008 |
| holmes | 4,315 | 2,587 | 1,733 | 2008 |
| indian river | 25,670 | 27,933 | 16,792 | 2008 |
| jackson | 11,271 | 6,967 | 3,323 | 2008 |
| jefferson | 3,800 | 3,007 | 1,199 | 2008 |
| lafayette | 1,801 | 1,041 | 530 | 2008 |
| lake | 60,393 | 57,528 | 29,446 | 2008 |
| lee | 123,872 | 64,081 | 82,030 | 2008 |
| leon | 77,194 | 42,432 | 27,810 | 2008 |
| levy | 10,146 | 4,254 | 4,406 | 2008 |
| liberty | 1,850 | 1,028 | 432 | 2008 |
| madison | 3,980 | 3,454 | 1,471 | 2008 |
| manatee | 96,193 | 20,177 | 35,515 | 2008 |
| marion | 81,455 | 47,683 | 33,419 | 2008 |
| martin | 24,844 | 28,285 | 25,603 | 2008 |
| miami-dade | 372,299 | 326,358 | 170,820 | 2008 |
| monroe | 14,113 | 13,608 | 12,043 | 2008 |
| nassau | 10,837 | 18,976 | 8,231 | 2008 |
| okaloosa | 41,232 | 33,340 | 14,841 | 2008 |
| okeechobee | 5,531 | 4,782 | 2,536 | 2008 |
| orange | 201,361 | 145,278 | 119,288 | 2008 |
| osceola | 43,695 | 41,305 | 14,641 | 2008 |
| palm beach | 323,264 | 144,545 | 124,218 | 2008 |
| pasco | 109,789 | 68,858 | 37,143 | 2008 |
| pinellas | 237,430 | 46,385 | 180,873 | 2008 |
| polk | 131,739 | 60,064 | 52,289 | 2008 |
| putnam | 17,663 | 9,952 | 5,468 | 2008 |
| santa rosa | 24,447 | 26,275 | 19,626 | 2008 |
| sarasota | 82,880 | 70,693 | 30,732 | 2008 |
| seminole | 85,801 | 77,781 | 11,398 | 2008 |
| st. johns | 35,905 | 49,241 | 53,087 | 2008 |
| st. lucie | 47,018 | 43,691 | 42,092 | 2008 |
| sumter | 11,678 | 28,587 | 8,822 | 2008 |
| suwannee | 9,079 | 5,246 | 3,399 | 2008 |
| taylor | 4,576 | 2,835 | 1,986 | 2008 |
| union | 2,674 | 1,386 | 1,282 | 2008 |
| volusia | 124,715 | 68,371 | 50,724 | 2008 |
| wakulla | 5,494 | 6,386 | 2,519 | 2008 |
| walton | 12,435 | 9,649 | 4,891 | 2008 |
| washington | 5,687 | 3,138 | 2,377 | 2008 |
| alachua | 47,367 | 13,418 | 10,456 | 2006 |
| baker | 4,033 | 1,608 | 737 | 2006 |
| bay | 27,548 | 13,488 | 7,607 | 2006 |
| bradford | 5,214 | 1,121 | 908 | 2006 |
| brevard | 185,074 | 12,725 | 28,085 | 2006 |
| broward | 288,224 | 78,003 | 44,872 | 2006 |
| calhoun | 2,573 | 522 | 317 | 2006 |
| charlotte | 28,900 | 17,096 | 9,784 | 2006 |
| citrus | 32,033 | 9,968 | 10,035 | 2006 |
| clay | 37,442 | 8,736 | 5,797 | 2006 |
| collier | 48,059 | 20,436 | 17,847 | 2006 |
| columbia | 10,128 | 3,397 | 2,226 | 2006 |
| desoto | 4,371 | 1,192 | 1,109 | 2006 |
| dixie | 3,669 | 799 | 704 | 2006 |
| duval | 173,925 | 34,193 | 30,046 | 2006 |
| escambia | 62,294 | 8,392 | 17,209 | 2006 |
| flagler | 19,060 | 7,488 | 3,312 | 2006 |
| franklin | 2,468 | 763 | 764 | 2006 |
| gadsden | 12,552 | 2,603 | 1,640 | 2006 |
| gilchrist | 3,867 | 908 | 673 | 2006 |
| glades | 2,304 | 140 | 547 | 2006 |
| gulf | 3,068 | 1,242 | 627 | 2006 |
| hamilton | 2,515 | 382 | 755 | 2006 |
| hardee | 2,828 | 1,218 | 516 | 2006 |
| hendry | 3,639 | 1,120 | 753 | 2006 |
| hernando | 40,672 | 8,206 | 8,410 | 2006 |
| highlands | 17,599 | 8,948 | 4,043 | 2006 |
| hillsborough | 184,379 | 54,051 | 53,072 | 2006 |
| holmes | 2,419 | 1,236 | 930 | 2006 |
| indian river | 27,204 | 10,538 | 6,181 | 2006 |
| jackson | 9,707 | 2,008 | 1,592 | 2006 |
| jefferson | 3,905 | 1,336 | 837 | 2006 |
| lafayette | 1,444 | 403 | 370 | 2006 |
| lake | 58,530 | 16,744 | 11,845 | 2006 |
| lee | 108,042 | 21,914 | 25,491 | 2006 |
| leon | 68,544 | 7,689 | 14,763 | 2006 |
| levy | 8,474 | 778 | 2,215 | 2006 |
| liberty | 1,840 | 377 | 204 | 2006 |
| madison | 4,777 | 1,234 | 611 | 2006 |
| manatee | 77,538 | 7,078 | 16,044 | 2006 |
| marion | 75,043 | 10,736 | 16,046 | 2006 |
| martin | 28,298 | 15,946 | 9,217 | 2006 |
| miami-dade | 279,910 | 66,190 | 65,955 | 2006 |
| monroe | 14,983 | 2,700 | 7,012 | 2006 |
| nassau | 14,101 | 5,527 | 3,061 | 2006 |
| okaloosa | 37,869 | 9,171 | 9,720 | 2006 |
| okeechobee | 5,408 | 1,712 | 1,261 | 2006 |
| orange | 149,732 | 37,052 | 36,232 | 2006 |
| osceola | 33,752 | 5,806 | 5,429 | 2006 |
| palm beach | 255,922 | 59,720 | 54,954 | 2006 |
| pasco | 94,458 | 19,020 | 16,814 | 2006 |
| pinellas | 207,960 | 35,338 | 50,620 | 2006 |
| polk | 100,360 | 13,753 | 28,910 | 2006 |
| putnam | 15,672 | 1,807 | 2,495 | 2006 |
| santa rosa | 28,637 | 7,976 | 6,108 | 2006 |
| sarasota | 88,927 | 30,832 | 22,831 | 2006 |
| seminole | 73,999 | 21,450 | 15,314 | 2006 |
| st. johns | 41,314 | 11,805 | 8,187 | 2006 |
| st. lucie | 45,553 | 13,132 | 12,124 | 2006 |
| sumter | 12,193 | 15,266 | 3,137 | 2006 |
| suwannee | 8,167 | 1,101 | 1,840 | 2006 |
| taylor | 5,665 | 833 | 1,013 | 2006 |
| union | 2,153 | 154 | 627 | 2006 |
| volusia | 111,432 | 19,122 | 23,069 | 2006 |
| wakulla | 9,674 | 2,335 | 1,213 | 2006 |
| walton | 10,869 | 3,110 | 2,437 | 2006 |
| washington | 4,959 | 1,239 | 1,202 | 2006 |
| alachua | 76,381 | 16,389 | 18,956 | 2004 |
| baker | 5,879 | 2,843 | 1,333 | 2004 |
| bay | 37,981 | 18,572 | 18,799 | 2004 |
| bradford | 6,871 | 2,429 | 1,614 | 2004 |
| brevard | 185,305 | 30,564 | 50,261 | 2004 |
| broward | 431,488 | 176,743 | 100,388 | 2004 |
| calhoun | 3,718 | 1,447 | 850 | 2004 |
| charlotte | 46,741 | 13,659 | 19,803 | 2004 |
| citrus | 34,177 | 20,117 | 15,527 | 2004 |
| clay | 57,209 | 6,770 | 17,786 | 2004 |
| collier | 57,323 | 44,155 | 27,984 | 2004 |
| columbia | 14,490 | 7,339 | 3,262 | 2004 |
| desoto | 5,393 | 2,310 | 1,831 | 2004 |
| dixie | 4,462 | 1,091 | 926 | 2004 |
| duval | 258,746 | 58,693 | 64,010 | 2004 |
| escambia | 95,647 | 12,807 | 35,760 | 2004 |
| flagler | 23,694 | 8,873 | 6,008 | 2004 |
| franklin | 3,111 | 1,586 | 1,285 | 2004 |
| gadsden | 12,024 | 5,467 | 3,629 | 2004 |
| gilchrist | 4,514 | 1,616 | 922 | 2004 |
| glades | 3,101 | 345 | 750 | 2004 |
| gulf | 4,099 | 1,725 | 1,489 | 2004 |
| hamilton | 3,147 | 937 | 1,060 | 2004 |
| hardee | 3,400 | 2,831 | 1,073 | 2004 |
| hendry | 5,817 | 2,481 | 1,537 | 2004 |
| hernando | 50,942 | 13,942 | 15,920 | 2004 |
| highlands | 23,351 | 11,385 | 7,168 | 2004 |
| hillsborough | 313,930 | 86,642 | 64,857 | 2004 |
| holmes | 5,418 | 1,987 | 970 | 2004 |
| indian river | 32,015 | 17,328 | 12,210 | 2004 |
| jackson | 12,655 | 4,677 | 2,613 | 2004 |
| jefferson | 3,979 | 2,486 | 1,040 | 2004 |
| lafayette | 1,935 | 873 | 548 | 2004 |
| lake | 68,121 | 32,331 | 24,063 | 2004 |
| lee | 161,953 | 32,182 | 47,669 | 2004 |
| leon | 95,397 | 17,974 | 23,000 | 2004 |
| levy | 11,748 | 2,078 | 2,939 | 2004 |
| liberty | 1,823 | 864 | 366 | 2004 |
| madison | 4,905 | 2,446 | 1,007 | 2004 |
| manatee | 106,917 | 10,274 | 27,052 | 2004 |
| marion | 99,628 | 14,799 | 25,938 | 2004 |
| martin | 43,892 | 12,243 | 16,746 | 2004 |
| miami-dade | 435,887 | 244,022 | 100,780 | 2004 |
| monroe | 19,706 | 8,918 | 11,091 | 2004 |
| nassau | 15,839 | 9,989 | 7,028 | 2004 |
| okaloosa | 59,566 | 10,383 | 20,073 | 2004 |
| okeechobee | 6,565 | 3,507 | 2,194 | 2004 |
| orange | 243,536 | 79,171 | 68,899 | 2004 |
| osceola | 52,525 | 19,120 | 10,550 | 2004 |
| palm beach | 404,666 | 49,831 | 92,428 | 2004 |
| pasco | 127,526 | 29,584 | 35,059 | 2004 |
| pinellas | 299,201 | 69,902 | 89,004 | 2004 |
| polk | 148,150 | 6,023 | 59,651 | 2004 |
| putnam | 22,775 | 3,775 | 4,608 | 2004 |
| santa rosa | 40,098 | 15,824 | 11,541 | 2004 |
| sarasota | 122,101 | 33,872 | 40,838 | 2004 |
| seminole | 135,161 | 20,671 | 30,853 | 2004 |
| st. johns | 52,151 | 18,758 | 15,378 | 2004 |
| st. lucie | 58,795 | 22,882 | 18,896 | 2004 |
| sumter | 11,382 | 16,517 | 4,180 | 2004 |
| suwannee | 10,666 | 2,466 | 2,736 | 2004 |
| taylor | 4,671 | 2,421 | 1,533 | 2004 |
| union | 2,836 | 369 | 1,509 | 2004 |
| volusia | 144,109 | 40,166 | 44,214 | 2004 |
| wakulla | 5,372 | 4,776 | 1,671 | 2004 |
| walton | 12,996 | 6,462 | 4,611 | 2004 |
| washington | 5,676 | 2,653 | 2,173 | 2004 |
Now that we have formatted the data for all the years let’s combine all the data frames and check to ensure that the county names are consistent across years
df <- rbind(df_12,df7,df3_6)
df$county <- tolower(df$county)
unique(df$county)
## [1] "alachua" "baker" "bay" "bradford"
## [5] "brevard" "broward" "calhoun" "charlotte"
## [9] "citrus" "clay" "collier" "columbia"
## [13] "desoto" "dixie" "duval" "escambia"
## [17] "flagler" "franklin" "gadsden" "gilchrist"
## [21] "glades" "gulf" "hamilton" "hardee"
## [25] "hendry" "hernando" "highlands" "hillsborough"
## [29] "holmes" "indian river" "jackson" "jefferson"
## [33] "lafayette" "lake" "lee" "leon"
## [37] "levy" "liberty" "madison" "manatee"
## [41] "marion" "martin" "miamiâ dade" "monroe"
## [45] "nassau" "okaloosa" "okeechobee" "orange"
## [49] "osceola" "palm beach" "pasco" "pinellas"
## [53] "polk" "putnam" "santa rosa" "sarasota"
## [57] "seminole" "st johns" "st lucie" "sumter"
## [61] "suwannee" "taylor" "union" "volusia"
## [65] "wakulla" "walton" "washington" "miami-dade"
## [69] "st. johns" "st. lucie"
We see that there are two different spellings for St Johns and St Lucie. Let’s correct this and also verify that each county occurs a total 7 times. Once for each year.
df$county <- gsub("st johns","st. johns",df$county)
df$county <- gsub("st lucie", "st. lucie", df$county)
unique(table(df$county))
## [1] 7 5 2
By wrapping table inside of unique we were able to determine that the count for each county has a value of 7 for each of the seven years. Also, let’s confirm that we do have the correct years.
unique(df$year)
## [1] 2016 2014 2012 2010 2008 2006 2004
## Levels: 2016 2014 2012 2010 2008 2006 2004
Great, now let’s glimpse the data again to confirm that everything is in order.
glimpse(df)
## Observations: 469
## Variables: 5
## $ year <fct> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2...
## $ county <chr> "alachua", "baker", "bay", "bradford", "brevard",...
## $ election_day <chr> "50491", "3584", "25579", "5095", "109623", "2131...
## $ early_voting <chr> "51097", "7180", "43963", "4675", "118427", "4265...
## $ absentee <chr> "29040", "2003", "18952", "2450", "92708", "20290...
Oops! It appears that the numeric columns are treated as characters. Let’s check the last few rows of the data to see if we can detect the problem.
tail(df)
## year county election_day early_voting absentee
## 285 2004 taylor 4,671 2,421 1,533
## 286 2004 union 2,836 369 1,509
## 287 2004 volusia 144,109 40,166 44,214
## 288 2004 wakulla 5,372 4,776 1,671
## 289 2004 walton 12,996 6,462 4,611
## 290 2004 washington 5,676 2,653 2,173
It appears that the comma is embedded in some of the values, for the earlier years, causing numeric values to be interpreted as characters. Let’s remove the comma and check the data once more.
df[,c(3:5)] <- apply(df[,c(3:5)], 2, function(x) as.numeric(gsub(",","",as.character(x))))
kable(df,row.names = FALSE) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "400px")
| year | county | election_day | early_voting | absentee |
|---|---|---|---|---|
| 2016 | alachua | 50491 | 51097 | 29040 |
| 2016 | baker | 3584 | 7180 | 2003 |
| 2016 | bay | 25579 | 43963 | 18952 |
| 2016 | bradford | 5095 | 4675 | 2450 |
| 2016 | brevard | 109623 | 118427 | 92708 |
| 2016 | broward | 213112 | 426592 | 202904 |
| 2016 | calhoun | 2448 | 2692 | 1053 |
| 2016 | charlotte | 23477 | 39644 | 35019 |
| 2016 | citrus | 26556 | 29323 | 25105 |
| 2016 | clay | 33064 | 54533 | 20165 |
| 2016 | collier | 42681 | 74932 | 57159 |
| 2016 | columbia | 9132 | 14044 | 5884 |
| 2016 | desoto | 4274 | 4452 | 2216 |
| 2016 | dixie | 3011 | 2319 | 1935 |
| 2016 | duval | 136287 | 227475 | 76053 |
| 2016 | escambia | 59696 | 56213 | 39791 |
| 2016 | flagler | 15191 | 29696 | 13537 |
| 2016 | franklin | 2264 | 2314 | 1533 |
| 2016 | gadsden | 8359 | 9627 | 4317 |
| 2016 | gilchrist | 4233 | 2412 | 1865 |
| 2016 | glades | 2720 | 863 | 831 |
| 2016 | gulf | 2293 | 3612 | 1456 |
| 2016 | hamilton | 2430 | 1702 | 1392 |
| 2016 | hardee | 3241 | 3618 | 829 |
| 2016 | hendry | 3597 | 5810 | 1861 |
| 2016 | hernando | 35004 | 25911 | 34558 |
| 2016 | highlands | 16149 | 18996 | 11197 |
| 2016 | hillsborough | 173637 | 242184 | 191201 |
| 2016 | holmes | 3669 | 2901 | 2044 |
| 2016 | indian river | 25359 | 33136 | 23196 |
| 2016 | jackson | 7970 | 9116 | 4179 |
| 2016 | jefferson | 3553 | 2883 | 1317 |
| 2016 | lafayette | 1653 | 1147 | 624 |
| 2016 | lake | 65105 | 75197 | 32672 |
| 2016 | lee | 74735 | 100893 | 158302 |
| 2016 | leon | 55731 | 67518 | 31792 |
| 2016 | levy | 9287 | 5108 | 5182 |
| 2016 | liberty | 1497 | 1347 | 508 |
| 2016 | madison | 2792 | 4554 | 1261 |
| 2016 | manatee | 64556 | 48604 | 68494 |
| 2016 | marion | 72776 | 66191 | 38146 |
| 2016 | martin | 23118 | 34038 | 30375 |
| 2016 | miamiâ dade | 217803 | 476001 | 311597 |
| 2016 | monroe | 15597 | 13325 | 14398 |
| 2016 | nassau | 13480 | 23515 | 10294 |
| 2016 | okaloosa | 34348 | 45186 | 23318 |
| 2016 | okeechobee | 5393 | 5879 | 2596 |
| 2016 | orange | 141015 | 255113 | 169031 |
| 2016 | osceola | 34044 | 63254 | 45228 |
| 2016 | palm beach | 278030 | 241376 | 152226 |
| 2016 | pasco | 100110 | 77027 | 68565 |
| 2016 | pinellas | 168563 | 78177 | 255374 |
| 2016 | polk | 116350 | 92598 | 80091 |
| 2016 | putnam | 12996 | 13856 | 6621 |
| 2016 | santa rosa | 32823 | 37197 | 19143 |
| 2016 | sarasota | 75810 | 79817 | 77811 |
| 2016 | seminole | 56794 | 109262 | 63291 |
| 2016 | st. johns | 39611 | 71617 | 27329 |
| 2016 | st. lucie | 44579 | 57289 | 41389 |
| 2016 | sumter | 12517 | 45545 | 19812 |
| 2016 | suwannee | 8006 | 6490 | 4374 |
| 2016 | taylor | 4011 | 3099 | 2257 |
| 2016 | union | 2412 | 2317 | 1025 |
| 2016 | volusia | 91647 | 97936 | 74939 |
| 2016 | wakulla | 5548 | 6824 | 3152 |
| 2016 | walton | 13519 | 13439 | 7141 |
| 2016 | washington | 5050 | 3675 | 2509 |
| 2014 | alachua | 41973 | 17983 | 19091 |
| 2014 | baker | 4350 | 2621 | 1653 |
| 2014 | bay | 26898 | 17278 | 14225 |
| 2014 | bradford | 4997 | 2062 | 1640 |
| 2014 | brevard | 118826 | 33327 | 73765 |
| 2014 | broward | 223095 | 129248 | 123820 |
| 2014 | calhoun | 2297 | 1258 | 701 |
| 2014 | charlotte | 23534 | 16463 | 28194 |
| 2014 | citrus | 25484 | 13800 | 19777 |
| 2014 | clay | 36089 | 19082 | 13699 |
| 2014 | collier | 42921 | 31768 | 40603 |
| 2014 | columbia | 8658 | 6161 | 3828 |
| 2014 | desoto | 3844 | 2158 | 1622 |
| 2014 | dixie | 2827 | 1196 | 1489 |
| 2014 | duval | 143441 | 73604 | 55759 |
| 2014 | escambia | 53510 | 18421 | 27453 |
| 2014 | flagler | 15407 | 13984 | 9091 |
| 2014 | franklin | 2253 | 998 | 1128 |
| 2014 | gadsden | 9099 | 5162 | 3533 |
| 2014 | gilchrist | 3624 | 1035 | 1405 |
| 2014 | glades | 2057 | 245 | 775 |
| 2014 | gulf | 2288 | 1830 | 1090 |
| 2014 | hamilton | 2057 | 662 | 1049 |
| 2014 | hardee | 1855 | 1842 | 789 |
| 2014 | hendry | 3100 | 2281 | 1400 |
| 2014 | hernando | 31957 | 9781 | 23154 |
| 2014 | highlands | 15466 | 9126 | 7900 |
| 2014 | hillsborough | 156529 | 88379 | 131659 |
| 2014 | holmes | 3039 | 1256 | 1512 |
| 2014 | indian river | 24047 | 14749 | 14783 |
| 2014 | jackson | 7349 | 4354 | 3644 |
| 2014 | jefferson | 3550 | 1587 | 1192 |
| 2014 | lafayette | 1500 | 693 | 480 |
| 2014 | lake | 60379 | 28798 | 24980 |
| 2014 | lee | 70887 | 38077 | 104345 |
| 2014 | leon | 58289 | 27885 | 22939 |
| 2014 | levy | 7222 | 1993 | 4315 |
| 2014 | liberty | 1514 | 815 | 390 |
| 2014 | madison | 3250 | 2122 | 1040 |
| 2014 | manatee | 62249 | 13898 | 45344 |
| 2014 | marion | 67987 | 20715 | 33109 |
| 2014 | martin | 23313 | 14615 | 24621 |
| 2014 | miamiâ dade | 227275 | 116936 | 189324 |
| 2014 | monroe | 13633 | 5665 | 9818 |
| 2014 | nassau | 14388 | 8679 | 7888 |
| 2014 | okaloosa | 33757 | 14877 | 13409 |
| 2014 | okeechobee | 4665 | 2228 | 2147 |
| 2014 | orange | 130688 | 71962 | 110637 |
| 2014 | osceola | 31266 | 15866 | 22544 |
| 2014 | palm beach | 241002 | 93593 | 86952 |
| 2014 | pasco | 83556 | 32455 | 46912 |
| 2014 | pinellas | 133828 | 18987 | 204706 |
| 2014 | polk | 104849 | 29473 | 60309 |
| 2014 | putnam | 12596 | 5422 | 5041 |
| 2014 | santa rosa | 30964 | 11982 | 10785 |
| 2014 | sarasota | 77364 | 30903 | 55160 |
| 2014 | seminole | 66202 | 36368 | 45222 |
| 2014 | st. johns | 44479 | 26845 | 16572 |
| 2014 | st. lucie | 41534 | 19632 | 29399 |
| 2014 | sumter | 16260 | 25471 | 13941 |
| 2014 | suwannee | 6947 | 2424 | 3498 |
| 2014 | taylor | 3738 | 1206 | 1826 |
| 2014 | union | 2396 | 1713 | 1055 |
| 2014 | volusia | 88743 | 33259 | 56719 |
| 2014 | wakulla | 5911 | 3176 | 2599 |
| 2014 | walton | 11495 | 4763 | 4311 |
| 2014 | washington | 4454 | 1846 | 2065 |
| 2012 | alachua | 54204 | 39271 | 27393 |
| 2012 | baker | 4101 | 3601 | 1880 |
| 2012 | bay | 29255 | 33857 | 17797 |
| 2012 | bradford | 5869 | 3958 | 1887 |
| 2012 | brevard | 144657 | 51449 | 91170 |
| 2012 | broward | 341637 | 245952 | 171785 |
| 2012 | calhoun | 2822 | 2273 | 1112 |
| 2012 | charlotte | 24979 | 26688 | 33653 |
| 2012 | citrus | 26650 | 24686 | 23031 |
| 2012 | clay | 36171 | 41692 | 18842 |
| 2012 | collier | 49089 | 52290 | 48646 |
| 2012 | columbia | 9271 | 13177 | 4829 |
| 2012 | desoto | 4694 | 3138 | 2107 |
| 2012 | dixie | 3776 | 1262 | 1959 |
| 2012 | duval | 157365 | 174420 | 82913 |
| 2012 | escambia | 67630 | 42861 | 39647 |
| 2012 | flagler | 20070 | 19906 | 10651 |
| 2012 | franklin | 2618 | 1470 | 1437 |
| 2012 | gadsden | 8895 | 9598 | 4123 |
| 2012 | gilchrist | 4249 | 1935 | 1789 |
| 2012 | glades | 2541 | 498 | 943 |
| 2012 | gulf | 2591 | 3046 | 1517 |
| 2012 | hamilton | 2659 | 1442 | 1355 |
| 2012 | hardee | 3346 | 2846 | 1083 |
| 2012 | hendry | 4537 | 3846 | 1933 |
| 2012 | hernando | 40030 | 12752 | 27806 |
| 2012 | highlands | 16103 | 16991 | 9806 |
| 2012 | hillsborough | 205283 | 166990 | 171666 |
| 2012 | holmes | 4091 | 2391 | 1892 |
| 2012 | indian river | 23841 | 28646 | 19630 |
| 2012 | jackson | 8913 | 7777 | 4371 |
| 2012 | jefferson | 3904 | 2481 | 1454 |
| 2012 | lafayette | 1892 | 962 | 589 |
| 2012 | lake | 63751 | 57612 | 30793 |
| 2012 | lee | 125346 | 51932 | 89638 |
| 2012 | leon | 72848 | 45110 | 30316 |
| 2012 | levy | 9816 | 3653 | 5108 |
| 2012 | liberty | 1617 | 1210 | 496 |
| 2012 | madison | 3608 | 3669 | 1411 |
| 2012 | manatee | 91799 | 16357 | 45880 |
| 2012 | marion | 80570 | 43209 | 39052 |
| 2012 | martin | 21739 | 26335 | 31425 |
| 2012 | miami-dade | 406841 | 235916 | 247285 |
| 2012 | monroe | 14491 | 11106 | 13778 |
| 2012 | nassau | 11658 | 18339 | 10772 |
| 2012 | okaloosa | 40005 | 32847 | 22496 |
| 2012 | okeechobee | 5895 | 3525 | 2975 |
| 2012 | orange | 205018 | 127583 | 137125 |
| 2012 | osceola | 41532 | 41736 | 26484 |
| 2012 | palm beach | 351600 | 124833 | 130532 |
| 2012 | pasco | 94321 | 60974 | 59648 |
| 2012 | pinellas | 172605 | 39569 | 250113 |
| 2012 | polk | 128094 | 55120 | 66392 |
| 2012 | putnam | 14169 | 10622 | 6743 |
| 2012 | santa rosa | 37931 | 22639 | 16648 |
| 2012 | sarasota | 78328 | 55473 | 73876 |
| 2012 | seminole | 89594 | 66721 | 53189 |
| 2012 | st. johns | 39631 | 52565 | 23515 |
| 2012 | st. lucie | 48960 | 37353 | 37041 |
| 2012 | sumter | 16559 | 30269 | 14028 |
| 2012 | suwannee | 8941 | 4593 | 4276 |
| 2012 | taylor | 4490 | 2518 | 2148 |
| 2012 | union | 2645 | 1662 | 1139 |
| 2012 | volusia | 110545 | 61097 | 63821 |
| 2012 | wakulla | 5940 | 5431 | 3321 |
| 2012 | walton | 13206 | 10088 | 5319 |
| 2012 | washington | 5120 | 3279 | 2636 |
| 2010 | alachua | 44570 | 16549 | 13567 |
| 2010 | baker | 4312 | 2748 | 1013 |
| 2010 | bay | 27101 | 17978 | 10148 |
| 2010 | bradford | 5298 | 1746 | 1145 |
| 2010 | brevard | 125316 | 23387 | 49967 |
| 2010 | broward | 264978 | 85768 | 76023 |
| 2010 | calhoun | 2894 | 1008 | 469 |
| 2010 | charlotte | 23093 | 17514 | 19749 |
| 2010 | citrus | 27442 | 15141 | 13133 |
| 2010 | clay | 37780 | 18424 | 7684 |
| 2010 | collier | 47001 | 29282 | 28290 |
| 2010 | columbia | 9481 | 6472 | 3149 |
| 2010 | desoto | 4008 | 1792 | 1227 |
| 2010 | dixie | 3380 | 910 | 1117 |
| 2010 | duval | 160334 | 52279 | 48260 |
| 2010 | escambia | 59410 | 16991 | 18941 |
| 2010 | flagler | 18203 | 10713 | 4906 |
| 2010 | franklin | 2204 | 1035 | 877 |
| 2010 | gadsden | 10044 | 4597 | 2136 |
| 2010 | gilchrist | 3607 | 906 | 952 |
| 2010 | glades | 2075 | 176 | 555 |
| 2010 | gulf | 2420 | 1867 | 807 |
| 2010 | hamilton | 2471 | 701 | 897 |
| 2010 | hardee | 2873 | 1796 | 655 |
| 2010 | hendry | 4046 | 1657 | 940 |
| 2010 | hernando | 36729 | 10313 | 12501 |
| 2010 | highlands | 15830 | 10237 | 4351 |
| 2010 | hillsborough | 178323 | 77351 | 65167 |
| 2010 | holmes | 3624 | 1322 | 1260 |
| 2010 | indian river | 24286 | 12582 | 11477 |
| 2010 | jackson | 8293 | 4157 | 2727 |
| 2010 | jefferson | 3773 | 1673 | 828 |
| 2010 | lafayette | 1614 | 619 | 324 |
| 2010 | lake | 56958 | 26280 | 20691 |
| 2010 | lee | 101711 | 36876 | 44754 |
| 2010 | leon | 63435 | 18594 | 18412 |
| 2010 | levy | 7901 | 1671 | 3343 |
| 2010 | liberty | 1468 | 517 | 242 |
| 2010 | madison | 3691 | 1533 | 1087 |
| 2010 | manatee | 73158 | 9765 | 24667 |
| 2010 | marion | 71925 | 19737 | 24821 |
| 2010 | martin | 25818 | 12070 | 17435 |
| 2010 | miami-dade | 268211 | 96884 | 134598 |
| 2010 | monroe | 14469 | 4058 | 7971 |
| 2010 | nassau | 13870 | 8128 | 5120 |
| 2010 | okaloosa | 36520 | 13482 | 9058 |
| 2010 | okeechobee | 4685 | 2175 | 1511 |
| 2010 | orange | 141692 | 51769 | 82355 |
| 2010 | osceola | 29050 | 16394 | 8985 |
| 2010 | palm beach | 265127 | 62640 | 61502 |
| 2010 | pasco | 85976 | 31124 | 21828 |
| 2010 | pinellas | 140751 | 10068 | 158668 |
| 2010 | polk | 100149 | 22754 | 39937 |
| 2010 | putnam | 14510 | 3841 | 3378 |
| 2010 | santa rosa | 29863 | 11952 | 7369 |
| 2010 | sarasota | 82013 | 38276 | 27551 |
| 2010 | seminole | 72165 | 33205 | 29481 |
| 2010 | st. johns | 39285 | 24443 | 11370 |
| 2010 | st. lucie | 41522 | 15259 | 20083 |
| 2010 | sumter | 14824 | 21694 | 6100 |
| 2010 | suwannee | 8339 | 2684 | 2603 |
| 2010 | taylor | 4103 | 1222 | 1388 |
| 2010 | union | 2378 | 501 | 796 |
| 2010 | volusia | 98368 | 29520 | 32621 |
| 2010 | wakulla | 6053 | 2997 | 2056 |
| 2010 | walton | 10816 | 5246 | 3090 |
| 2010 | washington | 4771 | 1656 | 1743 |
| 2008 | alachua | 49077 | 53556 | 22003 |
| 2008 | baker | 4193 | 5328 | 1562 |
| 2008 | bay | 33385 | 29404 | 16038 |
| 2008 | bradford | 5525 | 4430 | 1740 |
| 2008 | brevard | 168547 | 51879 | 63479 |
| 2008 | broward | 345855 | 252061 | 132268 |
| 2008 | calhoun | 3392 | 2154 | 712 |
| 2008 | charlotte | 23461 | 33580 | 28536 |
| 2008 | citrus | 30002 | 28011 | 18400 |
| 2008 | clay | 40397 | 39141 | 12605 |
| 2008 | collier | 53022 | 52734 | 36840 |
| 2008 | columbia | 9917 | 12945 | 5190 |
| 2008 | desoto | 4520 | 3761 | 1888 |
| 2008 | dixie | 4052 | 1742 | 1544 |
| 2008 | duval | 151203 | 183410 | 76202 |
| 2008 | escambia | 66512 | 49180 | 29991 |
| 2008 | flagler | 18671 | 22457 | 7949 |
| 2008 | franklin | 2629 | 1943 | 1469 |
| 2008 | gadsden | 8829 | 10559 | 3064 |
| 2008 | gilchrist | 4242 | 2192 | 1360 |
| 2008 | glades | 2785 | 595 | 894 |
| 2008 | gulf | 2608 | 3321 | 1251 |
| 2008 | hamilton | 2893 | 1542 | 1129 |
| 2008 | hardee | 3202 | 3271 | 973 |
| 2008 | hendry | 4402 | 4872 | 1652 |
| 2008 | hernando | 45231 | 23093 | 19875 |
| 2008 | highlands | 16203 | 20806 | 8088 |
| 2008 | hillsborough | 246303 | 146518 | 114765 |
| 2008 | holmes | 4315 | 2587 | 1733 |
| 2008 | indian river | 25670 | 27933 | 16792 |
| 2008 | jackson | 11271 | 6967 | 3323 |
| 2008 | jefferson | 3800 | 3007 | 1199 |
| 2008 | lafayette | 1801 | 1041 | 530 |
| 2008 | lake | 60393 | 57528 | 29446 |
| 2008 | lee | 123872 | 64081 | 82030 |
| 2008 | leon | 77194 | 42432 | 27810 |
| 2008 | levy | 10146 | 4254 | 4406 |
| 2008 | liberty | 1850 | 1028 | 432 |
| 2008 | madison | 3980 | 3454 | 1471 |
| 2008 | manatee | 96193 | 20177 | 35515 |
| 2008 | marion | 81455 | 47683 | 33419 |
| 2008 | martin | 24844 | 28285 | 25603 |
| 2008 | miami-dade | 372299 | 326358 | 170820 |
| 2008 | monroe | 14113 | 13608 | 12043 |
| 2008 | nassau | 10837 | 18976 | 8231 |
| 2008 | okaloosa | 41232 | 33340 | 14841 |
| 2008 | okeechobee | 5531 | 4782 | 2536 |
| 2008 | orange | 201361 | 145278 | 119288 |
| 2008 | osceola | 43695 | 41305 | 14641 |
| 2008 | palm beach | 323264 | 144545 | 124218 |
| 2008 | pasco | 109789 | 68858 | 37143 |
| 2008 | pinellas | 237430 | 46385 | 180873 |
| 2008 | polk | 131739 | 60064 | 52289 |
| 2008 | putnam | 17663 | 9952 | 5468 |
| 2008 | santa rosa | 24447 | 26275 | 19626 |
| 2008 | sarasota | 82880 | 70693 | 30732 |
| 2008 | seminole | 85801 | 77781 | 11398 |
| 2008 | st. johns | 35905 | 49241 | 53087 |
| 2008 | st. lucie | 47018 | 43691 | 42092 |
| 2008 | sumter | 11678 | 28587 | 8822 |
| 2008 | suwannee | 9079 | 5246 | 3399 |
| 2008 | taylor | 4576 | 2835 | 1986 |
| 2008 | union | 2674 | 1386 | 1282 |
| 2008 | volusia | 124715 | 68371 | 50724 |
| 2008 | wakulla | 5494 | 6386 | 2519 |
| 2008 | walton | 12435 | 9649 | 4891 |
| 2008 | washington | 5687 | 3138 | 2377 |
| 2006 | alachua | 47367 | 13418 | 10456 |
| 2006 | baker | 4033 | 1608 | 737 |
| 2006 | bay | 27548 | 13488 | 7607 |
| 2006 | bradford | 5214 | 1121 | 908 |
| 2006 | brevard | 185074 | 12725 | 28085 |
| 2006 | broward | 288224 | 78003 | 44872 |
| 2006 | calhoun | 2573 | 522 | 317 |
| 2006 | charlotte | 28900 | 17096 | 9784 |
| 2006 | citrus | 32033 | 9968 | 10035 |
| 2006 | clay | 37442 | 8736 | 5797 |
| 2006 | collier | 48059 | 20436 | 17847 |
| 2006 | columbia | 10128 | 3397 | 2226 |
| 2006 | desoto | 4371 | 1192 | 1109 |
| 2006 | dixie | 3669 | 799 | 704 |
| 2006 | duval | 173925 | 34193 | 30046 |
| 2006 | escambia | 62294 | 8392 | 17209 |
| 2006 | flagler | 19060 | 7488 | 3312 |
| 2006 | franklin | 2468 | 763 | 764 |
| 2006 | gadsden | 12552 | 2603 | 1640 |
| 2006 | gilchrist | 3867 | 908 | 673 |
| 2006 | glades | 2304 | 140 | 547 |
| 2006 | gulf | 3068 | 1242 | 627 |
| 2006 | hamilton | 2515 | 382 | 755 |
| 2006 | hardee | 2828 | 1218 | 516 |
| 2006 | hendry | 3639 | 1120 | 753 |
| 2006 | hernando | 40672 | 8206 | 8410 |
| 2006 | highlands | 17599 | 8948 | 4043 |
| 2006 | hillsborough | 184379 | 54051 | 53072 |
| 2006 | holmes | 2419 | 1236 | 930 |
| 2006 | indian river | 27204 | 10538 | 6181 |
| 2006 | jackson | 9707 | 2008 | 1592 |
| 2006 | jefferson | 3905 | 1336 | 837 |
| 2006 | lafayette | 1444 | 403 | 370 |
| 2006 | lake | 58530 | 16744 | 11845 |
| 2006 | lee | 108042 | 21914 | 25491 |
| 2006 | leon | 68544 | 7689 | 14763 |
| 2006 | levy | 8474 | 778 | 2215 |
| 2006 | liberty | 1840 | 377 | 204 |
| 2006 | madison | 4777 | 1234 | 611 |
| 2006 | manatee | 77538 | 7078 | 16044 |
| 2006 | marion | 75043 | 10736 | 16046 |
| 2006 | martin | 28298 | 15946 | 9217 |
| 2006 | miami-dade | 279910 | 66190 | 65955 |
| 2006 | monroe | 14983 | 2700 | 7012 |
| 2006 | nassau | 14101 | 5527 | 3061 |
| 2006 | okaloosa | 37869 | 9171 | 9720 |
| 2006 | okeechobee | 5408 | 1712 | 1261 |
| 2006 | orange | 149732 | 37052 | 36232 |
| 2006 | osceola | 33752 | 5806 | 5429 |
| 2006 | palm beach | 255922 | 59720 | 54954 |
| 2006 | pasco | 94458 | 19020 | 16814 |
| 2006 | pinellas | 207960 | 35338 | 50620 |
| 2006 | polk | 100360 | 13753 | 28910 |
| 2006 | putnam | 15672 | 1807 | 2495 |
| 2006 | santa rosa | 28637 | 7976 | 6108 |
| 2006 | sarasota | 88927 | 30832 | 22831 |
| 2006 | seminole | 73999 | 21450 | 15314 |
| 2006 | st. johns | 41314 | 11805 | 8187 |
| 2006 | st. lucie | 45553 | 13132 | 12124 |
| 2006 | sumter | 12193 | 15266 | 3137 |
| 2006 | suwannee | 8167 | 1101 | 1840 |
| 2006 | taylor | 5665 | 833 | 1013 |
| 2006 | union | 2153 | 154 | 627 |
| 2006 | volusia | 111432 | 19122 | 23069 |
| 2006 | wakulla | 9674 | 2335 | 1213 |
| 2006 | walton | 10869 | 3110 | 2437 |
| 2006 | washington | 4959 | 1239 | 1202 |
| 2004 | alachua | 76381 | 16389 | 18956 |
| 2004 | baker | 5879 | 2843 | 1333 |
| 2004 | bay | 37981 | 18572 | 18799 |
| 2004 | bradford | 6871 | 2429 | 1614 |
| 2004 | brevard | 185305 | 30564 | 50261 |
| 2004 | broward | 431488 | 176743 | 100388 |
| 2004 | calhoun | 3718 | 1447 | 850 |
| 2004 | charlotte | 46741 | 13659 | 19803 |
| 2004 | citrus | 34177 | 20117 | 15527 |
| 2004 | clay | 57209 | 6770 | 17786 |
| 2004 | collier | 57323 | 44155 | 27984 |
| 2004 | columbia | 14490 | 7339 | 3262 |
| 2004 | desoto | 5393 | 2310 | 1831 |
| 2004 | dixie | 4462 | 1091 | 926 |
| 2004 | duval | 258746 | 58693 | 64010 |
| 2004 | escambia | 95647 | 12807 | 35760 |
| 2004 | flagler | 23694 | 8873 | 6008 |
| 2004 | franklin | 3111 | 1586 | 1285 |
| 2004 | gadsden | 12024 | 5467 | 3629 |
| 2004 | gilchrist | 4514 | 1616 | 922 |
| 2004 | glades | 3101 | 345 | 750 |
| 2004 | gulf | 4099 | 1725 | 1489 |
| 2004 | hamilton | 3147 | 937 | 1060 |
| 2004 | hardee | 3400 | 2831 | 1073 |
| 2004 | hendry | 5817 | 2481 | 1537 |
| 2004 | hernando | 50942 | 13942 | 15920 |
| 2004 | highlands | 23351 | 11385 | 7168 |
| 2004 | hillsborough | 313930 | 86642 | 64857 |
| 2004 | holmes | 5418 | 1987 | 970 |
| 2004 | indian river | 32015 | 17328 | 12210 |
| 2004 | jackson | 12655 | 4677 | 2613 |
| 2004 | jefferson | 3979 | 2486 | 1040 |
| 2004 | lafayette | 1935 | 873 | 548 |
| 2004 | lake | 68121 | 32331 | 24063 |
| 2004 | lee | 161953 | 32182 | 47669 |
| 2004 | leon | 95397 | 17974 | 23000 |
| 2004 | levy | 11748 | 2078 | 2939 |
| 2004 | liberty | 1823 | 864 | 366 |
| 2004 | madison | 4905 | 2446 | 1007 |
| 2004 | manatee | 106917 | 10274 | 27052 |
| 2004 | marion | 99628 | 14799 | 25938 |
| 2004 | martin | 43892 | 12243 | 16746 |
| 2004 | miami-dade | 435887 | 244022 | 100780 |
| 2004 | monroe | 19706 | 8918 | 11091 |
| 2004 | nassau | 15839 | 9989 | 7028 |
| 2004 | okaloosa | 59566 | 10383 | 20073 |
| 2004 | okeechobee | 6565 | 3507 | 2194 |
| 2004 | orange | 243536 | 79171 | 68899 |
| 2004 | osceola | 52525 | 19120 | 10550 |
| 2004 | palm beach | 404666 | 49831 | 92428 |
| 2004 | pasco | 127526 | 29584 | 35059 |
| 2004 | pinellas | 299201 | 69902 | 89004 |
| 2004 | polk | 148150 | 6023 | 59651 |
| 2004 | putnam | 22775 | 3775 | 4608 |
| 2004 | santa rosa | 40098 | 15824 | 11541 |
| 2004 | sarasota | 122101 | 33872 | 40838 |
| 2004 | seminole | 135161 | 20671 | 30853 |
| 2004 | st. johns | 52151 | 18758 | 15378 |
| 2004 | st. lucie | 58795 | 22882 | 18896 |
| 2004 | sumter | 11382 | 16517 | 4180 |
| 2004 | suwannee | 10666 | 2466 | 2736 |
| 2004 | taylor | 4671 | 2421 | 1533 |
| 2004 | union | 2836 | 369 | 1509 |
| 2004 | volusia | 144109 | 40166 | 44214 |
| 2004 | wakulla | 5372 | 4776 | 1671 |
| 2004 | walton | 12996 | 6462 | 4611 |
| 2004 | washington | 5676 | 2653 | 2173 |
Finally, the data is in tidy format! Let’s pause for a round of applause.
Let’s visualize the data to reveal any trends.
data <- gather(df, "vote_type", count, 3:5, -c(1,2))
to_plot <- data %>%
group_by(year,vote_type) %>%
summarise(total = sum(count)) %>%
mutate(percentage = total/sum(total))
to_plot$year <- factor(to_plot$year, levels = c("2004","2006","2008","2010","2012","2014","2016"))
ggplot(to_plot, aes(x = year, y = total, fill = vote_type))+
geom_col()+
labs(x = 'Year',y = 'Total Voters', title = 'Florida\'s\ Yearly Voter Turnout\n (2004 - 2016)',
fill = 'Vote Method')+
scale_y_continuous(labels = comma)+
geom_text(aes(label = paste0(round(percentage*100),"%")), position = position_stack(vjust = 0.5))+
scale_fill_brewer(palette = "Dark2")+
theme_minimal()+
theme(plot.title = element_text(hjust = 0.5))
Based on the above chart we observe a few things:
The turnout for years 2004, 2008, 2012 and 2016 is higher than the other years. This is because these were presidential election years and people are more likely to vote in a presidential election than in a midterm general election.
There is an upward trend to the data, especially for presidential election years. The turnout in 2016 was almost 10,000,000.
Election day voting has been decreasing for each of the presidential and midterm elections. In 2004, 64% of voters voted on election day but only 31% of voters voted on election day in 2016. This is about a 50% drop in election day voting.
Early voting accounted for 40% of all voters in 2016 compared to 28% in 2012. The 2016 presidential election was somewhat of a political storm and turnout was expected to be significantly higher than usual. As a result, a larger portion of people may have chosen early voting to avoid the hassle of voting on election day.
At times the data needed for analysis is not readily available. As data analysts, data scientists, and sometimes statisticians it is part of our jobs to be equipped with the skills needed to extract and tidy data in both structured and unstructured format. This post provided a brief introduction on how we can extract structured data from web-based files, tidy and then visualize the data to gain insights.
The R Markdown script used to create this file can be found here. Additionally, an interactive chart was created to allow the user to view the turnout within a specific county and can be found here.